Dear Sir:

This Appeal Brief, filed in connection with the above captioned patent application, is 
responsive to the Final Office Action mailed on August 15, 2005. A Notice of Appeal was filed 
herein on November 15, 2005. A one-month extension of time is requested
herewith. Appellants hereby appeal to the Board of Patent Appeals and Interferences from the
final rejection in this case. 

The following constitutes the Appellants' Brief on Appeal.

I. REAL PARTY IN INTEREST

The real party in interest is Genentech, Inc., South San Francisco, California, by an 
assignment of the parent application, U.S. Patent Application Serial No. 09/941,992 recorded 
November 16, 2001, at Reel 012176 and Frame 0450. 

II. RELATED APPEALS AND INTERFERENCES 

The claims pending in the current application are directed to a polypeptide referred to
herein as "PRO1112". There exist two related patent applications: (1) U.S. Patent Application
Serial No. 09/989,328, filed November 19, 2001 (containing claims directed to nucleic acids
encoding PRO1112 polypeptides), and (2) U.S. Patent Application Serial No. 09/990,436, filed
November 14, 2001 (containing claims directed to antibodies to PRO1112 polypeptides). U.S.
Patent Application Serial No. 09/989,328 (nucleic acid case) has been allowed and the issue fee
has been paid. The related U.S. Patent Application Serial No. 09/990,436 is also
under final rejection by the same Examiner, based upon the same outstanding rejections; an
appeal is being pursued independently and concurrently herewith.

III. STATUS OF CLAIMS 

Claims 119-126 and 129-131 are in this application.
Claims 1-118 and 127-128 have been canceled. 

Claims 119-126 and 129-131 stand rejected and Appellants appeal the rejection of these
claims.

A copy of the rejected claims in the present Appeal is provided as Appendix A. 

IV. STATUS OF AMENDMENTS 

All claim amendments have been entered by the Examiner. 

V. SUMMARY OF CLAIMED SUBJECT MATTER 

The invention claimed in the present application is related to an isolated polypeptide
comprising the amino acid sequence of the polypeptide of SEQ ID NO: 207, referred to in the
present application as "PRO1112." The PRO1112 gene was shown for the first time in the
present application to be significantly amplified in human lung and colon cancers as compared to
normal, non-cancerous human tissue controls (Example 170). This feature is specifically recited
in Claim 124, and carried by all claims dependent from Claim 124. In addition, the invention
also claims the amino acid sequence of the polypeptide of SEQ ID NO: 207, lacking its 
associated signal-peptide; or the amino acid sequence of the polypeptide encoded by the full- 
length coding sequence of the cDNA deposited under ATCC accession number 209951 
(Claims 124-126 and 129). The invention is further directed to polypeptides having at least 80% 
to 99% amino acid sequence identity to the amino acid sequence of the polypeptide of SEQ ID 
NO: 207; the amino acid sequence of the polypeptide of SEQ ID NO: 207, lacking its associated 
signal peptide; or the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951, wherein the nucleic 
acid encoding said polypeptide is amplified in colon tumor (Claims 119-123). The invention is
further directed to a chimeric polypeptide comprising one of the above polypeptides fused to a 
heterologous polypeptide (Claim 130), and to a chimeric polypeptide wherein the heterologous 
polypeptide is an epitope tag or an Fc region of an immunoglobulin (Claim 131). PRO 
polypeptide variants having at least about 80-99% amino acid sequence identity with a full 
length PRO polypeptide sequence, or a PRO polypeptide sequence lacking the signal peptide are 
generally described in the specification at, for example, page 305, line 23 onwards, and percent 
amino acid sequence identity determination is generally described at least at, for example, 
pages 306-308, line 14 onwards. The preparation of chimeric PRO polypeptides (Claims 130 
and 131), including those wherein the heterologous polypeptide is an epitope tag or an Fc region 
of an immunoglobulin, is set forth in the specification at page 374, line 24, to page 375, line 9.
Examples 140-143 and page 376, line 12 onwards describe the expression of PRO polypeptides 
in various host cells, including E. coli, mammalian cells, yeast and Baculovirus-infected insect 
cells. 

The amino acid sequence of the native "PRO1112" polypeptide and the nucleic acid
sequence encoding this polypeptide (referred to in the present application as "DNA57702-1476")
are shown in the present specification as SEQ ID NOs: 207 and 206, respectively, and in
Figures 135 and 134, described on page 294, lines 10-13. The full-length PRO1112 polypeptide
having the amino acid sequence of SEQ ID NO: 207 is described in the specification at, for
example, page 17 and pages 127-128, and the isolation of cDNA clones encoding PRO1112 of
SEQ ID NO: 207 is described in Example 57, page 449 of the specification.

Finally, Example 170, in the specification at page 539, line 19, to page 555, line 5, sets
forth a 'Gene Amplification assay' which shows that the PRO1112 gene is amplified in the
genome of certain human colon cancers (see Table 9A, pages 550-551). The profiles of various
primary colon tumors used for screening the PRO polypeptide compounds of the invention in the
gene amplification assay are summarized in Table 8, page 546 of the specification.

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL 

1. Whether Claims 119-126 and 129-131 satisfy the utility/enablement requirement
under 35 U.S.C. §§101/112, first paragraph.

2. Whether Claims 119-123 and 130-131 satisfy the written description requirement
under 35 U.S.C. §112, first paragraph.

VII. ARGUMENTS 

Summary of the Arguments
Issue 1: Utility/Enablement

Appellants rely upon the gene amplification data of the PRO1112 gene for patentable
utility of the PRO1112 polypeptides and their antibodies. This data is clearly disclosed in the
instant specification in Example 170, which discloses that the gene encoding PRO1112 showed
significant amplification, ranging from 2.196-fold to 3.364-fold amplification in seven lung
tumors and 2.092-fold to 4.807-fold amplification in twelve out of fifteen colon tumors.
Appellants have submitted, in their Response filed August 4, 2005, a Declaration by Dr. Audrey
Goddard, which explains that a gene identified as being amplified at least 2-fold by the disclosed
gene amplification assay in a tumor sample relative to a normal sample is useful as a marker for
the diagnosis of cancer, and for monitoring cancer development and/or for measuring the
efficacy of cancer therapy. Therefore, such a gene is useful as a marker for the diagnosis of lung
or colon cancer, and for monitoring cancer development and/or for measuring the efficacy of
cancer therapy. Appellants have also submitted, in their Response filed June 25, 2004, ample
evidence to show that, in general, if a gene is amplified in cancer, it is more likely than not that
the encoded protein will be expressed at an elevated level. First, the articles by Orntoft et al.,
Hyman et al., and Pollack et al. collectively teach that in general gene amplification increases
mRNA expression. Second, the Declaration of Dr. Paul Polakis, principal investigator of the
Tumor Antigen Project of Genentech, Inc., the assignee of the present application, shows that, in
general, there is a correlation between mRNA levels and polypeptide levels. Third, Appellants
further submit that even if there were no correlation between gene amplification and increased
mRNA/protein expression (which Appellants expressly do not concede), a polypeptide
encoded by a gene that is amplified in cancer would still have a specific, substantial, and credible
utility. Appellants submit that, as evidenced by the Ashkenazi Declaration and the teachings of
Hanna and Mornin (both made of record in Appellants' Response filed June 25, 2004),
simultaneous testing of gene amplification and gene product over-expression enables more
accurate tumor classification, even if the gene product, the protein, is not over-expressed. This
leads to better determination of a suitable therapy for the tumor, as demonstrated by a real-world
example of the breast cancer marker HER-2/neu. Appellants further note that the sale of gene
expression chips to measure mRNA levels is a highly successful business, with a company such
as Affymetrix recording 168.3 million dollars in sales of their GeneChip arrays in 2004. Clearly,
the research community believes that the information obtained from these chips is useful (i.e.,
that it is more likely than not informative of the protein level). Therefore, as a general rule, one
skilled in the art would find it more likely than not that PRO1112 and its antibodies are useful as
diagnostic tools for detecting lung or colon tumors.

The Examiner acknowledges on page 3 of the Final Office Action mailed
August 15, 2005 that "the data in Table 8 may provide a basis for utility and enablement of
PRO1112 nucleic acid," but contends that the data "does not provide a basis for utility or
enablement of the claimed polypeptides. The art supports this position by establishing that there
is no strong correlation between gene amplification and increased mRNA or protein levels." The
Examiner maintains the rejection based on previously cited references Pennica et al., Konopka et
al., Hu et al. and Haynes et al. and newly cited references Lian et al., Fessler et al., and Chen et
al.

The Examiner further quotes the Hittelman reference and adds that "(t)he art recognizes
that lung epithelium is at risk for cellular damage due to direct exposure to environmental
pollutants and carcinogens, which result in aneuploidy before the epithelial cells turn
cancerous... Hittelman teach that damaged, precancerous lung epithelium is often aneuploid."

Appellants submit that the teachings of the Examiner's cited references do not
conclusively establish a prima facie case for lack of utility (as will be discussed in detail below).
In particular, contrary to the Examiner's interpretation, the data of Haynes et al. (see Figure 1)
and Chen et al. (see Tables I and II of the paper) suggest that a positive correlation does exist
between gene and protein expression. In addition, the teachings of Hu et al., Lian et al. and
Fessler et al. do not show a lack of correlation between mRNA and protein expression for genes
in general. In fact, these cited references expressly acknowledge various limitations in their
studies and note that their conclusions were drawn by excluding certain data points. Appellants
respectfully submit that such conclusions cannot be used to validate the Examiner's conclusions
regarding the correlation between gene and mRNA/protein expression in general. Since the
Examiner has not cited evidence that clearly addresses gene and mRNA/protein expression in
general, a prima facie case for lack of utility has not been made. Appellants further submit that,
even if the amplification of the PRO1112 gene were due to aneuploidy (which Appellants
expressly do not concede), the art exemplified by the Hittelman et al. reference still
supports the Appellants' position because it still provides utility for the PRO1112 gene, at least
as a marker for cancer or precancerous cells or damaged tissue. Accordingly, the PRO1112 gene
finds utility as a diagnostic for cancer or for individuals at risk for developing lung or colon
cancer.

Taken together, although there are some examples in the scientific art that do not fit
within the central dogma of molecular biology that there is generally a positive correlation
between DNA, mRNA, and polypeptide levels, in the majority of amplified genes, as
exemplified by the teachings of Orntoft et al., Hyman et al., Pollack et al., and the Polakis
Declaration, the art overwhelmingly shows that gene amplification influences gene expression at
the mRNA and protein levels. The widespread, art-accepted use of information obtained from
array chips for detecting diagnostic markers lends further support to the view that, in general, one
of skill in the art would reasonably expect in this instance, based on the amplification data for the
PRO1112 gene, that the PRO1112 polypeptide is concomitantly overexpressed and has utility in
the diagnosis of lung or colon cancer or for individuals at risk for developing lung or colon
cancer.

Accordingly, Appellants submit that when the proper legal standard is applied, one
should reach the conclusion that the present application discloses at least one patentable utility
for the claimed PRO1112 polypeptides and antibodies thereto. Accordingly, one of ordinary
skill in the art would also understand how to make and use the recited antibodies for the
diagnosis of lung or colon cancer without any undue experimentation.

Issue 2: Written Description 

The factors to be considered in evidencing possession of a claimed genus include
"disclosure of complete or partial structure, physical and/or chemical properties, functional
characteristics, structure/function correlation, methods of making the claimed product, or any
combination thereof." Current applicable case law holds that biological sequences are not
adequately described solely by a description of their desired functional activities. It is, however,
well established that a combination of functional and structural features suffices to describe a
claimed genus, as discussed in the PTO's own Written Description Guidelines, and as set forth in
Enzo Biochem, Inc. v. Gen-Probe Inc. Appellants note that the claims recite structural features,
namely, 80-99% sequence identity to the native sequence of SEQ ID NO: 207, which are
common to the genus. The genus of claimed polypeptides is further defined by having a specific
functional activity for the encoding nucleic acids, namely, that the encoding nucleic acid is
amplified in colon tumors. The specification provides detailed guidance as to how to identify the
recited variants of SEQ ID NO: 207, including methods for determining percent identity between
two amino acid sequences, listings of exemplary and preferred sequence substitutions,
and detailed protocols for determining whether a gene encoding a variant PRO1112 protein
is amplified in colon tumor. Thus, one of skill in the art could easily identify whether a variant
PRO1112 sequence falls within the parameters of the claimed invention.

Accordingly, a description of the claimed genus has been achieved by the recitation of 
both structural and functional characteristics. 

These arguments are all discussed in further detail below under the appropriate headings. 



On Appeal to the Board of Patent Appeals and Interferences 

Appellants' Brief 
Application Serial No. 09/992,643 
Attorney's Docket No. 39780-2730 P1C13 




Response to Rejections 

ISSUE 1. Claims 119-126 and 129-131 are Supported by a Credible, Specific and
Substantial Asserted Utility, and Thus, Meet the Utility Requirement of 35 U.S.C.
§§101/112, First Paragraph

The sole basis for the Examiner's rejection of Claims 119-126 and 129-131 under this 
section is that the data presented in Example 170 of the present specification is allegedly 
insufficient under the present legal standards to establish a patentable utility under 35 U.S.C. 
§101 for the presently claimed subject matter. 

Claims 119-126 and 129-131 stand further rejected under 35 U.S.C. §112, first paragraph,
allegedly "since the claimed invention is not supported by either a specific and substantial asserted 
utility or a well established utility for the reasons set forth above, one skilled in the art clearly would 
not know how to use the claimed invention." 

Appellants strongly disagree and, therefore, respectfully traverse the rejection.

A. The Legal Standard For Utility Under 35 U.S.C. §101 

According to 35 U.S.C. §101: 

Whoever invents or discovers any new and useful process, machine, manufacture, 
or composition of matter, or any new and useful improvement thereof, may obtain 
a patent therefor, subject to the conditions and requirements of this title. 
(Emphasis added). 

In interpreting the utility requirement, in Brenner v. Manson,¹ the Supreme Court held
that the quid pro quo contemplated by the U.S. Constitution between the public interest and the
interest of the inventors required that a patent Applicant disclose a "substantial utility" for his or
her invention, i.e., a utility "where specific benefit exists in currently available form."² The
Court concluded that "a patent is not a hunting license. It is not a reward for the search, but
compensation for its successful conclusion. A patent system must be related to the world of
commerce rather than the realm of philosophy."³

¹ Brenner v. Manson, 383 U.S. 519, 148 U.S.P.Q. (BNA) 689 (1966).
² Id. at 534, 148 U.S.P.Q. (BNA) at 695.
³ Id. at 536, 148 U.S.P.Q. (BNA) at 696.

Later, in Nelson v. Bowler,⁴ the C.C.P.A. acknowledged that tests evidencing
pharmacological activity of a compound may establish practical utility, even though they may
not establish a specific therapeutic use. The Court held that "since it is crucial to provide
researchers with an incentive to disclose pharmaceutical activities in as many compounds as
possible, we conclude adequate proof of any such activity constitutes a showing of practical
utility."⁵

In Cross v. Iizuka,⁶ the C.A.F.C. reaffirmed Nelson, and added that in vitro results might
be sufficient to support practical utility, explaining that "in vitro testing, in general, is relatively
less complex, less time consuming, and less expensive than in vivo testing. Moreover, in vitro
results with the particular pharmacological activity are generally predictive of in vivo test results,
i.e. there is a reasonable correlation therebetween."⁷ The Court perceived "No insurmountable
difficulty" in finding that, under appropriate circumstances, "in vitro testing, may establish a
practical utility."⁸

The case law has also clearly established that Appellants' statements of utility are usually
sufficient, unless such statement of utility is unbelievable on its face.⁹ The PTO has the initial
burden to prove that Appellants' claims of usefulness are not believable on their face.¹⁰ In
general, an Appellant's assertion of utility creates a presumption of utility that will be sufficient
to satisfy the utility requirement of 35 U.S.C. §101, "unless there is a reason for one skilled in
the art to question the objective truth of the statement of utility or its scope."¹¹ ¹²

⁴ Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. (BNA) 881 (C.C.P.A. 1980).
⁵ Id. at 856, 206 U.S.P.Q. (BNA) at 883.
⁶ Cross v. Iizuka, 753 F.2d 1047, 224 U.S.P.Q. (BNA) 739 (Fed. Cir. 1985).
⁷ Id. at 1050, 224 U.S.P.Q. (BNA) at 747.
⁸ Id.
⁹ In re Gazave, 379 F.2d 973, 154 U.S.P.Q. (BNA) 92 (C.C.P.A. 1967).
¹⁰ Ibid.

Compliance with 35 U.S.C. §101 is a question of fact.¹³ The evidentiary standard to be
used throughout ex parte examination in setting forth a rejection is a preponderance of the
totality of the evidence under consideration.¹⁴ Thus, to overcome the presumption of truth that
an assertion of utility by the Appellant enjoys, the Examiner must establish that it is more likely
than not that one of ordinary skill in the art would doubt the truth of the statement of utility.
Only after the Examiner has made a proper prima facie showing of lack of utility does the burden of
rebuttal shift to the Appellant. The issue will then be decided on the totality of evidence.

The well established case law is clearly reflected in the Utility Examination Guidelines
("Utility Guidelines"),¹⁵ which acknowledge that an invention complies with the utility
requirement of 35 U.S.C. §101, if it has at least one asserted "specific, substantial, and credible
utility" or a "well-established utility." Under the Utility Guidelines, a utility is "specific" when
it is particular to the subject matter claimed. For example, it is generally not enough to state that
a nucleic acid is useful as a diagnostic without also identifying the conditions that are to be
diagnosed.

In explaining the "substantial utility" standard, M.P.E.P. §2107.01 cautions, however,
that Office personnel must be careful not to interpret the phrase "immediate benefit to the
public" or similar formulations used in certain court decisions to mean that products or services
based on the claimed invention must be "currently available" to the public in order to satisfy the
utility requirement. "Rather, any reasonable use that an applicant has identified for the invention
that can be viewed as providing a public benefit should be accepted as sufficient, at least with
regard to defining a 'substantial' utility."¹⁶ Indeed, the Guidelines for Examination of
Applications for Compliance With the Utility Requirement,¹⁷ gives the following instruction to
patent examiners: "If the Applicant has asserted that the claimed invention is useful for any
particular practical purpose . . . and the assertion would be considered credible by a person of
ordinary skill in the art, do not impose a rejection based on lack of utility."

¹¹ In re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. (BNA) 288, 297 (C.C.P.A. 1974).
¹² See also In re Jolles, 628 F.2d 1322, 206 U.S.P.Q. 885 (C.C.P.A. 1980); In re Irons,
340 F.2d 974, 144 U.S.P.Q. 351 (1965); In re Sichert, 566 F.2d 1154, 1159, 196 U.S.P.Q. 209,
212-13 (C.C.P.A. 1977).
¹³ Raytheon v. Roper, 724 F.2d 951, 956, 220 U.S.P.Q. (BNA) 592, 596 (Fed. Cir. 1983),
cert. denied, 469 U.S. 835 (1984).
¹⁴ In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d (BNA) 1443, 1444 (Fed. Cir. 1992).
¹⁵ 66 Fed. Reg. 1092 (2001).

¹⁶ M.P.E.P. §2107.01.
¹⁷ M.P.E.P. §2107 II(B)(1).

B. Proper Application of the Legal Standard

Appellants submit that the evidentiary standard to be used throughout ex parte
examination of a patent application is a preponderance of the totality of the evidence under
consideration. Thus, to overcome the presumption of truth that an assertion of utility by the
Appellant enjoys, the Examiner must establish that it is more likely than not that one of ordinary
skill in the art would doubt the truth of the statement of utility. Only after the Examiner has
made a proper prima facie showing of lack of utility does the burden of rebuttal shift to the
Appellant.

Appellants respectfully submit that the data presented in Example 170 starting on
page 539 of the specification and the cumulative evidence of record, which
underlies the current dispute, indeed support a "specific, substantial and credible" asserted utility
for the presently claimed invention.

Patentable utility for the PRO1112 polypeptides and their antibodies is based upon the gene
amplification data for the gene encoding the PRO1112 polypeptide. Example 170 describes the
results obtained using a very well-known and routinely employed polymerase chain reaction
(PCR)-based assay, the TaqMan™ PCR assay, also referred to herein as the gene amplification
assay. This assay allows one to quantitatively measure the level of gene amplification in a given
sample, say, a tumor extract, or a cell line. It was well known in the art at the time the invention
was made that gene amplification is an essential mechanism for oncogene activation. Appellants
isolated genomic DNA from a variety of primary cancers and cancer cell lines that are listed in
Table 9 (pages 539 onwards of the specification), including primary colon cancers of the type
and stage indicated in Table 8 (page 546). The tumor samples were tested in triplicate with
TaqMan™ primers and with internal controls, beta-actin and GAPDH, in order to quantitatively
compare DNA levels between samples (page 548, lines 33-34). As negative controls, DNA
isolated from the cells of ten normal healthy individuals was pooled and used as a control
(page 539, lines 27-29), and no-template controls were also run (page 548, lines 33-34). The
results of TaqMan™ PCR are reported in ΔCt units, as explained in the passage on page 539,
lines 37-39. One unit corresponds to one PCR cycle or approximately a 2-fold amplification
relative to control, two units correspond to 4-fold, three units to 8-fold amplification, and so on.
Using this PCR-based assay, Appellants showed that the gene encoding PRO1112 was amplified,
that is, it showed approximately 1.135-1.775 ΔCt units, which corresponds to 2^1.135 to
2^1.775-fold amplification, or 2.196-fold to 3.364-fold amplification, in seven lung tumors, and
1.065-2.265 ΔCt units, which corresponds to 2^1.065 to 2^2.265-fold amplification, or
2.092-fold to 4.807-fold amplification, in twelve out of fifteen colon tumors.
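
The ΔCt-to-fold conversion described above (one ΔCt unit is approximately a 2-fold change, so fold amplification is 2 raised to the ΔCt value) can be illustrated with a short Python sketch. This is offered for illustration only and is not part of the specification or the record; the function names `fold_amplification` and `meets_threshold` are hypothetical, and the default 2-fold cutoff is an assumption taken from the significance threshold attributed to the Goddard Declaration in this Brief.

```python
def fold_amplification(delta_ct: float) -> float:
    """Convert a ΔCt value to fold amplification relative to control.

    Assumes each ΔCt unit corresponds to one PCR cycle, i.e. a 2-fold change.
    """
    return 2.0 ** delta_ct


def meets_threshold(delta_ct: float, min_fold: float = 2.0) -> bool:
    """Return True if the fold amplification is at least `min_fold`.

    The 2-fold default is an assumed cutoff, not a value fixed by the assay.
    """
    return fold_amplification(delta_ct) >= min_fold


if __name__ == "__main__":
    # ΔCt endpoints quoted in the Brief for the colon tumor samples.
    for delta_ct in (1.065, 2.265):
        fold = fold_amplification(delta_ct)
        print(f"ΔCt {delta_ct}: {fold:.3f}-fold, significant={meets_threshold(delta_ct)}")
```

Under this assumption, the colon tumor endpoints of 1.065 and 2.265 ΔCt units map to roughly 2.092-fold and 4.807-fold, which agrees with the figures recited above.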

Appellants point out that the Declaration by Dr. Audrey Goddard provides a statement by
an expert in the relevant art that "fold amplification" values of at least 2-fold are considered
significant in the TaqMan™ PCR gene amplification assay. Appellants particularly draw the
Board's attention to page 3 of the Goddard Declaration which clearly states that:

It is further my considered scientific opinion that an at least 2-fold increase in
gene copy number in a tumor tissue sample relative to a normal (i.e., non-tumor)
sample is significant and useful in that the detected increase in gene copy number
in the tumor sample relative to the normal sample serves as a basis for using
relative gene copy number as quantitated by the TaqMan PCR technique as a
diagnostic marker for the presence or absence of tumor in a tissue sample of
unknown pathology. Accordingly, a gene identified as being amplified at least
2-fold by the quantitative TaqMan PCR assay in a tumor sample relative to a normal
sample is useful as a marker for the diagnosis of cancer, for monitoring cancer
development and/or for measuring the efficacy of cancer therapy. (Emphasis
added).

Accordingly, the 2.196-fold to 3.364-fold amplification in seven lung tumors and the
2.092-fold to 4.807-fold amplification in twelve out of fifteen colon tumors would be considered
significant and credible by one skilled in the art, based upon the facts disclosed in the Goddard
Declaration. Any skilled artisan in the field of oncology would readily appreciate that this gene
is a good candidate marker for diagnosing lung or colon tumors and would clearly find utility for
the PRO1112 gene as a diagnostic for lung or colon cancer or for diagnosing individuals at risk
for developing lung or colon cancer.

Further, as discussed below, Appellants had provided ample evidence in the form of
articles from the art, like Orntoft et al., Hyman et al., Pollack et al., and also the Polakis and
Ashkenazi declarations, to show that, in general, if a gene is amplified in cancer, it is "more
likely than not" that the encoded protein will also be expressed at an elevated level. The
Examiner appears to disregard the ample evidence provided by the Appellants based on
misinterpretations of the teachings therein, as will be discussed below. Further, it is not a legal
requirement to establish a necessary correlation between an increase in the copy number of the
DNA and protein expression levels that would correlate to the disease state, nor is it
imperative to find evidence that DNA amplification is "necessarily" or "always" associated with
overexpression of the gene product. Appellants respectfully submit that when the proper
evidentiary standard is applied, a correlation must be acknowledged. The "more likely than not"
standard is a much lower standard, and Appellants submit that, in fact, this standard is clearly
met by the instant disclosure.

C. A prima facie case of lack of utility has not been established 

The Examiner argues based on Pennica et al., Konopka et al. and Haynes et al. that
"(w)hile the data in Table 9 may provide a basis for utility and enablement of PRO1112 nucleic
acid, it does not provide a basis for utility or enablement of the claimed polypeptides" (page 3
of the Final Office Action mailed August 15, 2005).

Appellants respectfully submit that, contrary to the Examiner's assertion, none of the
cited references conclusively establishes a prima facie case for lack of utility for the PRO1112
molecule. For instance, the teachings of Pennica et al. are specific to WISP genes, a specific
class of closely related molecules. Pennica et al. showed that there was good correlation
between DNA and mRNA expression levels for the WISP-1 gene but not for the WISP-2 and
WISP-3 genes. But, the fact that in the case of closely related molecules, there seemed to be no
correlation between gene amplification and the level of mRNA/protein expression does not
establish that it is more likely than not, in general, that such correlation does not exist. As
discussed above, the standard is not absolute certainty. Pennica et al. has no teaching
whatsoever about the correlation of gene amplification and protein expression for genes in
general. Similarly, with respect to Konopka et al., Appellants submit that the Examiner has
generalized a very specific result disclosed by Konopka et al. to cover all genes. Konopka et al.
actually state that "[p]rotein expression is not related to amplification of the abl gene but to
variation in the level of bcr-abl mRNA produced from a single Ph¹ template," (See Konopka et al.,
Abstract, emphasis added). The paper does not teach anything whatsoever about the correlation of
protein expression and gene amplification in general and provides no basis for the generalization
that apparently underlies the present rejection. The statement of Konopka et al. that "[p]rotein
expression is not related to amplification of the abl gene ..." is not sufficient to establish a
prima facie case of lack of utility. Therefore, the combined teachings of Pennica et al. and
Konopka et al. are not directed towards genes in general but to a single gene or genes within a
single family and thus, their teachings cannot support a general conclusion regarding correlation
between gene amplification and mRNA or protein levels.

Actually, the cited reference Haynes et al. showed that "there was a general trend,
although no strong correlation between protein [expression] and transcript levels." (see Figure 1
and page 1863, paragraph 2.1, last line). Therefore, when the proper legal standard is used,
Haynes clearly supports the Appellants' position. A general trend is all that is needed to meet the
"more likely than not" evidentiary standard. Again, accurate prediction is not the standard. Therefore,
a prima facie case of lack of utility has not been established based on the cited references Pennica
et al., Konopka et al. and Haynes et al.

Further, Appellants respectfully submit that, contrary to the Examiner's assertion, the 
cited Hu et al. reference does not conclusively establish a prima facie case for lack of utility for 
the PRO1112 molecule. The Hu et al. reference is entitled "Analysis of Genomic and Proteomic 
Data using Advanced Literature Mining" (emphasis added). Therefore, as the title itself 
suggests, the conclusions in this reference are based upon statistical analysis of information 
obtained from published literature, and not from experimental data. Hu et al. performed 
statistical analysis to provide evidence for a relationship between mRNA expression and 
biological function of a given molecule (as in disease). The conclusions of Hu et al., however, 
only apply to a specific type of breast tumor (estrogen receptor (ER)-positive breast tumor) and 
cannot be generalized to breast cancer genes in general, let alone to cancer genes in general. 
"Interestingly, the observed correlation was only found among ER-positive (breast) tumors not 
ER-negative tumors." (See page 412, left column). 

Moreover, the analytical methods utilized by Hu et al. have certain statistical drawbacks, 
as the authors themselves admit. For instance, according to Hu et al., "different statistical 
methods" were applied to "estimate the strength of gene-disease relationships and evaluated the 
results." (See page 406, left column, emphasis added). Using these different statistical methods, 
Hu et al. "[a]ssessed the relative strengths of gene-disease relationships based on the frequency 
of both co-citation and single citation." (See page 411, left column). As is well known in the 
art, different statistical methods allow different variables to be manipulated to affect the resulting 
outcome. In this regard, the authors disclose that, "Initial attempts to search the literature" using 
the list of genes, gene names, gene symbols, and frequently used synonyms generated by the 
authors "revealed several sources of false positives and false negatives." (See page 406, right 
column). The authors add that the false positives caused by "duplicative and unrelated meanings 
for the term" were "difficult to manage." Therefore, in order to minimize such false positives, 
Hu et al. disclose that these terms "had to be eliminated entirely, thereby reducing the false 
positive rate but unavoidably under-representing some genes." Id. (Emphasis added). Hence, 
Hu et al. had to manipulate certain aspects of the input data in order to generate, in their opinion, 
meaningful results. Further, because the frequency of citation for a given molecule and its 
relationship to disease only reflects the current research interest in a molecule, and not the true 
biological function of the molecule, as the authors themselves acknowledge, the "[r]elationship 
established by frequency of co-citation do not necessarily represent a true biological link." (See 
page 411, right column). Therefore, based on these findings, the authors add, "[t]his may reflect 
a bias in the literature to study the more prevalent type of tumor in the population. Furthermore, 
this emphasizes that caution must be taken when interpreting experiments that may contain 
subpopulations that behave very differently." Id. (Emphasis added). In other words, some 
molecules may have been underrepresented merely because they were less frequently cited or 
studied in the literature compared to other more well-cited or studied genes. Therefore, Hu et al.'s 
conclusions are not based on genes/mRNA in general. 

Therefore, Appellants submit that, based on the nature of the statistical analysis 
performed therein, and in particular, based on Hu's analysis of one class of genes, namely, the 
estrogen receptor (ER)-positive breast tumor genes, the conclusion drawn by the Examiner, 
namely that "genes displaying a 5-fold change or less (mRNA expression) in tumors compared 
to normal showed no evidence of a correlation between altered gene expression and a known role 
in the disease (in general)," is not reliably supported. 

The Examiner further cites new references by Lian et al., Fessler et al. and Chen et al. in 
support of her interpretation that "protein levels cannot be accurately predicted from the level of 
the corresponding mRNA transcript." (Pages 5 and 6 of the instant Final Office Action mailed 
August 15, 2005). 

Appellants respectfully submit that Lian et al. only teach that protein expression may not 
correlate with mRNA level in differentiating myeloid cells and do not teach anything of such a 
lack of correlation for genes in general. In fact, the authors themselves admit that there were a 
number of problems with their data. For instance, at page 520 of this article, the authors 
explicitly express their concerns regarding the methods they utilized and the interpretation of 
their data, stating that "[t]hese data must be considered with several caveats: membrane and other 
hydrophobic proteins and very basic proteins are not well displayed by the standard 2DE 
approach, and proteins presented at low level will be missed. In addition, to simplify MS 
analysis, we used a Coomassie dye stain rather than silver to visualize proteins, and this decreased 
the sensitivity of detection of minor proteins." (Emphasis added). Appellants submit that, as is 
well known in the art, the Coomassie dye staining method is a very insensitive method of measuring 
protein. Therefore, the conclusions based on such measurements would hardly be considered 
accurate by the skilled artisan, or at least, would not be extrapolated to generally reflect the 
gene:mRNA/protein relationships for proteins in general. Therefore, even if the teachings of Lian 
et al. reflect a lack of correlation between the genes and mRNA/proteins in differentiating 
myeloid cells (which Appellants submit is not a representative sample of genes in general since 
only certain genes are expressed during differentiation), their conclusions are based on a widely 
accepted, insensitive method for protein staining, namely, the Coomassie dye staining, which 
cannot be applied to genes in general. 

Similarly, with respect to Fessler et al., Appellants submit that the PTO has overlooked a 
number of limitations in their study, which the authors themselves acknowledge. For instance, 
Fessler et al. examined only lipopolysaccharide-activated neutrophils and so, as with Lian et al., 
examined the expression level of only a few proteins/RNAs in response to LPS stimulation. 
Fessler et al. also concede that, since they used the Coomassie Blue dye staining method, which is 
known to have a limited protein binding range and a non-linear curve for protein detection, the 
resulting image analysis of the Coomassie Blue-stained proteins ought to be considered as 
semi-quantitative only (see page 31301, col. 1). Further, Fessler et al. state that protein 
identification in their study was done using two-dimensional PAGE but admit that the analysis was 
limited only to well-resolved regions of the gel, which, as Fessler et al. explicitly concede, 
tends to select for more abundant protein species and therefore may have performed less well with 
hydrophobic and high molecular weight proteins (see page 31301, col. 1). In addition, the Fessler 
et al. paper also indicates that the harvesting of the LPS-incubated PMNs at 4 hours may have 
prevented the detection of early, transiently appearing proteins, and that the process of post-LPS 
incubation and pre-two-dimensional PAGE cell washes would expectedly remove secreted proteins 
from further analysis, perhaps contributing to the observed transcript-protein discordance. 
Therefore, the Fessler et al. reference explains the reasons for its transcript-protein discordance 
and, like the Lian et al. reference, cannot be relied upon to make a general proposition that 
protein levels cannot be accurately predicted from mRNA levels. 

In addition, as discussed above with Fessler et al., Chen et al. also concede that there are 
problems with 2D gel protein detection and therefore, cannot accurately predict protein levels. 
For instance, Chen et al. state that "(i)t is apparent that without prior enrichment only a relatively 
small and highly selected population of long-lived, highly expressed proteins is observed. There 
are many more proteins in a given cell which are not visualized by such methods. Frequently it 
is the low abundance proteins that execute key regulatory functions" (page 1870, col. 1). Thus, 
Chen et al. concede that by selecting proteins visualized by 2D gels, they are likely to have 
excluded from their analysis many key regulatory proteins which could be candidate cancer 
markers. 

However, the analysis provided by Chen et al. in fact supports the Appellants' general 
proposition that, even if protein levels cannot be accurately predicted (which is not required by 
the utility standard), for the majority of the proteins studied, it is more likely than not that an 
increase in gene amplification or mRNA levels generally correlates well with increased protein 
levels. A review of the correlation coefficient data presented in the Chen et al. paper bears this 
out. For instance, Table I, which lists 66 genes [the paper incorrectly states there are 69 genes 
listed] for which only one protein isoform is expressed, shows that 40 genes out of 66 had a 
positive correlation between mRNA expression and protein expression. These data clearly meet 
the standard for "more likely than not". Similarly, in Table II, 30 genes with multiple isoforms 
[again the paper incorrectly states there are 29] were presented. In this case, in 22 genes out of 
30, at least one isoform showed a positive correlation between mRNA expression and protein 
expression. Furthermore, 12 genes out of 29 showed a strong positive correlation [as determined 
by the authors] for at least one isoform. No genes showed a significant negative correlation. 
Thus, Table II of Chen et al. also provides that it is more likely than not that protein levels will 
correlate with mRNA expression levels. In fact, the same authors as in Chen et al. published a 
later paper (Beer et al., Nature Medicine 8(8): 816-824 (2002), not enclosed) which described 
the expression of genes in adenocarcinomas as compared to protein expression. They observed 
that "these results suggest that the oligonucleotide microarrays provided reliable measures of 
gene expression" (pg 317) and further stated that "these studies indicate that many of the genes 
identified using gene expression profiles are likely relevant to lung adenocarcinoma." Therefore, 
the authors of the Chen paper clearly agreed that microarrays provided a reliable measure of the 
expression levels of the gene and could be used to identify genes whose overexpression is 
associated with tumors. 

In summary, Hu et al., Lian et al. and Fessler et al. do not conclusively teach that, in 
general, protein levels cannot be accurately predicted from mRNA/gene amplification levels. 
These authors concede that, either due to insensitive protein detection methods or due to the 
methodologies utilized in their protocols, some protein species may have been underrepresented 
relative to others. Therefore, the teachings of these references cannot be relied upon to establish a 
prima facie showing of lack of utility. On the other hand, as noted even in Haynes et al. and 
Chen et al., most genes showed a positive correlation between increased gene amplification, 
mRNA and translated protein. 

Appellants once again remind the Examiner that only after the Examiner has made a 
proper prima facie showing of lack of utility, does the burden of rebuttal shift to the Appellant. 
Based on the above discussions, such a showing has not been made in this instance. 
Accordingly, the instant rejection should be withdrawn for the Examiner's lack of establishment 
of a prima facie showing. 

D. The Gene Amplification Data Establishes Credible, Substantial and Specific 
Patentable Utility for the PRO1112 Polypeptide and Its Antibodies 

In fact, as discussed throughout prosecution, Appellants submit that Example 170 of the 
specification further discloses that "(a)mplification is associated with overexpression of the gene 
product, indicating that the polypeptides are useful targets for therapeutic intervention in certain 
cancers such as lung, colon, breast and other cancers and diagnostic determination of the 
presence of those cancers" (Emphasis added). Appellants have also submitted ample evidence to 
show that, in general, if a gene is amplified in cancer, it is "more likely than not" that the 
encoded protein will also be expressed at an elevated level. 

For instance, Appellants presented the articles by Orntoft et al., Hyman et al., and 
Pollack et al. (made of record in Appellants' Response filed June 25, 2004), which collectively 
teach that in general, for most genes, DNA amplification increases mRNA expression. The 
results presented by Orntoft et al., Hyman et al., and Pollack et al. are based upon wide-ranging 
analyses of a large number of tumor-associated genes. Orntoft et al. studied transcript levels of 
5600 genes in malignant bladder cancers, many of which were linked to the gain or loss of 
chromosomal material, and found that in general (18 of 23 cases) chromosomal areas with more 
than 2-fold gain of DNA showed a corresponding increase in mRNA transcripts. Hyman et al. 
compared DNA copy numbers and mRNA expression of over 12,000 genes in breast cancer 
tumors and cell lines, and found that there was evidence of a prominent global influence of copy 
number changes on gene expression levels. In Pollack et al., the authors profiled DNA copy 
number alteration across 6,691 mapped human genes in 44 predominantly advanced primary 
breast tumors and 10 breast cancer cell lines, and found that on average, a 2-fold change in DNA 
copy number was associated with a corresponding 1.5-fold change in mRNA levels. In 
summary, the evidence supports the Appellants' position that gene amplification is more likely 
than not predictive of increased mRNA and polypeptide levels. 

Second, the Declaration of Dr. Paul Polakis (made of record in Appellants' Response 
filed June 25, 2004), principal investigator of the Tumor Antigen Project of Genentech, Inc., the 
assignee of the present application, explains that in the course of Dr. Polakis' research using 
microarray analysis, he and his co-workers identified approximately 200 gene transcripts that are 
present in human tumor cells at significantly higher levels than in corresponding normal human 
cells. Appellants submit that Dr. Polakis' Declaration was presented to support the position that 
there is a correlation between mRNA levels and polypeptide levels, the correlation between gene 
amplification and mRNA levels having already been established by the data shown in the Orntoft 
et al., Hyman et al., and Pollack et al. articles. Appellants further emphasize that the opinions 
expressed in the Polakis Declaration, including in the above quoted statement, are all based on 
factual findings. For instance, antibodies binding to about 30 of these tumor antigens were 
prepared, and mRNA and protein levels were compared. In approximately 80% of the cases, the 
researchers found that increases in the level of a particular mRNA correlated with changes in the 
level of protein expressed from that mRNA when human tumor cells are compared with their 
corresponding normal cells. Therefore, Dr. Polakis' research, which is referenced in his 
Declaration, shows that, in general, there is a correlation between increased mRNA and 
polypeptide levels. Hence, one of skill in the art would reasonably expect that, based on the gene 
amplification data of the PRO1112 gene, the PRO1112 polypeptide is concomitantly 
overexpressed in the colon tumors studied as well. 

Appellants further note that the sale of gene expression chips to measure mRNA levels is 
a highly successful business, with a company such as Affymetrix recording 168.3 million dollars 
in sales of their GeneChip® arrays in 2004. Clearly, the research community believes that the 
information obtained from these chips is useful (i.e., that it is more likely than not that the results 
are informative of protein levels). 

The Examiner appears to disregard the ample evidence provided in the above referenced 
articles based on misinterpretations of their teachings. The "more likely than not" standard is a 
much lower standard than a "necessary" correlation or "accurate" prediction, and Appellants 
submit that in fact, this standard is clearly met by the instant disclosure, and furthermore, the 






On Appeal to the Board of Patent Appeals and Interferences 

Appellants' Brief 
Application Serial No. 09/992,643 
Attorney's Docket No. 39780-2730 P1C13 



Declarations and the articles by the Appellants lend significant support to the fact that for an 
amplified gene, it is more likely than not that the protein will also be overexpressed. Moreover, 
the Examiner's cited references do not present a prima facie case of lack of utility, as discussed 
above. 

The Examiner also argues, based on the Hittelman reference, that the "art recognizes 
that lung epithelium is at risk for cellular damage due to direct exposure to environmental 
pollutants and carcinogens, which result in aneuploidy before the epithelial cells turn 
cancerous. . . . Hittelman teach that damaged, precancerous lung epithelium is often aneuploid." 

Appellants submit that, even if the amplification of the PRO1112 gene were due to 
chromosomal aneuploidy (which Appellants expressly do not concede), the art exemplified by 
the Hittelman et al. reference still supports the Appellants' position for utility, because there is 
utility for an aneuploid gene at least as a marker for cancer or precancerous cells or damaged 
tissue. For example, Hittelman, who studied premalignant lung lesions, suggests that epithelial 
tumors develop through a multistep process driven by genetic instability (see Hittelman abstract). 
Hittelman showed that the same subset of molecular changes found in associated tumors were 
also found in premalignant lesions, suggesting that these premalignant lesions might represent 
precursor lesions for associated tumors. In other words, Hittelman suggests that cancer is a 
manifestation of a multistep tumorigenesis process (see Hittelman, page 4, last three lines). 
Therefore, contrary to the Examiner's interpretation, the Hittelman reference strongly supports 
the Appellants' position in that the cited art makes clear that there is utility in identifying genetic 
biomarkers in epithelial tissues at cancer risk. For example, Hittelman clearly says that "it is 
important to identify individuals at significantly increased cancer risk who might best benefit 
from different types of intervention" (see page 2, fourth paragraph, line 3 and also the abstract, 
lines 4-7 of Hittelman). Therefore, even if the observed PRO1112 gene amplification were shown 
to be due to chromosomal aneuploidy (which Appellants do not concede), 
identification of such a genetic biomarker is a very important and useful step, according to 
Hittelman, in identifying individuals at significantly increased cancer risk. In fact, one skilled in 
the art would find it entirely reasonable from Hittelman that early detection of lung cancer would 
provide vital, advance information regarding risk assessment, prognosis and therapy for lung 
cancer. Accordingly, the PRO1112 polynucleotides and the polypeptides that they encode find 
utility at least as diagnostic markers for individuals at risk of developing lung cancer, for the 
reasons discussed above. Thus, a prima facie case for lack of utility has not been made based on 
the Hittelman reference. 

Thus, based on the asserted utility for PRO1112 in the diagnosis of lung or colon tumors, 
the reduction to practice of the instantly claimed protein sequence of SEQ ID NO: 207 in the 
present application, the disclosure of the step-by-step protocols for making chimeric PRO 
polypeptides, including those wherein the heterologous polypeptide is an epitope tag or an Fc 
region of an immunoglobulin in the specification (at page 374, line 24 to page 375, line 9), the 
disclosure of a step-by-step protocol for making and expressing PRO1112 in appropriate host 
cells (in Examples 140-143 and page 376, line 12), the step-by-step protocol for the preparation, 
isolation and detection of monoclonal, polyclonal and other types of antibodies against the 
PRO1112 protein in the specification (at pages 390-395) and the disclosure of the gene 
amplification assay in Example 170, the skilled artisan would know exactly how to make and use 
the claimed polypeptide and its antibodies for the diagnosis of lung or colon cancers. Appellants 
submit that based on the detailed information presented in the specification and the advanced 
state of the art in oncology, the skilled artisan would have found such testing routine and not 
'undue.' 

Therefore, Appellants respectfully request reconsideration and reversal of the 
outstanding rejections under 35 U.S.C. §101 and §112, first paragraph, to Claims 122-126 and 
129-131. 

ISSUE 2: Claims 119-123, 130 and 131 Satisfy the Written Description Requirement of 35 
U.S.C. §112, First Paragraph 

Claims 119-123, 130 and 131 stand rejected under 35 U.S.C. §112, first paragraph, as 
allegedly containing "subject matter which was not described in the specification in such a way 
as to reasonably convey to one skilled in the relevant art that the inventor(s), at the time the 
application was filed, had possession of the claimed invention." In particular, the Examiner has 
asserted that the specification "does not (disclose) any variants of SEQ ID NO: 207, naturally 
occurring or not, nor whether such sequences are amplified in colon tumors" (Page 14, Final 
Office Action mailed August 15, 2005). The Examiner also cites Fiers v. Revel, 25 USPQ2d 
1601 (Fed. Cir. 1993) and Fiddes v. Baird, 30 USPQ2d 1481, 1483 (1993) to show that . 
Appellants respectfully disagree. 



A. The Legal Test for Written Description 

The well-established test for sufficiency of support under the written description 
requirement of 35 U.S.C. §112, first paragraph is "whether the disclosure of the application as 
originally filed reasonably conveys to the artisan that the inventor had possession at that time of 
the later claimed subject matter, rather than the presence or absence of literal support in the 
specification for the claim language." [18, 19] The adequacy of written description support is a 
factual issue and is to be determined on a case-by-case basis. [20] The factual determination in a 
written description analysis depends on the nature of the invention and the amount of knowledge 
imparted to those skilled in the art by the disclosure. [21, 22] 

In Environmental Designs, Ltd. v. Union Oil Co., [23] the Federal Circuit held, "Factors that 
may be considered in determining level of ordinary skill in the art include: (1) the educational 
level of the inventor; (2) type of problems encountered in the art; (3) prior art solutions to those 
problems; (4) rapidity with which innovations are made; (5) sophistication of the technology; 
and (6) educational level of active workers in the field." [24] Further, the "hypothetical 'person 
having ordinary skill in the art' to which the claimed subject matter pertains would, of necessity 
have the capability of understanding the scientific and engineering principles applicable to the 
pertinent art." [25, 26] 

[18] In re Kaslow, 707 F.2d 1366, 1374, 212 U.S.P.Q. 1089, 1096 (Fed. Cir. 1983). 

[19] See also Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 U.S.P.Q.2d at 1116 (Fed. 
Cir. 1991). 

[20] See e.g., Vas-Cath, 935 F.2d at 1563; 19 U.S.P.Q.2d at 1116. 

[21] Union Oil v. Atlantic Richfield Co., 208 F.3d 989, 996 (Fed. Cir. 2000). 

[22] See also M.P.E.P. §2163 II(A). 

[23] 713 F.2d 693, 696, 218 U.S.P.Q. 865, 868 (Fed. Cir. 1983), cert. denied, 464 U.S. 
1043 (1984). 

[24] See also M.P.E.P. §2141.03. 

B. The Disclosure Provides Sufficient Written Description for the Claimed 
Invention 

Appellants respectfully submit that the instant specification evidences the actual 
reduction to practice of the native amino acid sequence of SEQ ID NO: 207. Support for "native 
sequences" can be found in the instant specification at least at page 304, line 26. Appellants also 
submit that the specification provides ample written support for determining percent sequence 
identity between two amino acid sequences (See pages 306-308, line 14 onwards). In fact, the 
specification teaches specific parameters to be associated with the term "percent identity" as 
applied to the present invention. The specification further provides detailed guidance as to 
changes that may be made to a PRO polypeptide without adversely affecting its activity 
(page 372, line 36 to page 373, line 17). This guidance includes a listing of exemplary and 
preferred substitutions for each of the twenty naturally occurring amino acids (Table 6, 
page 372). Accordingly, one of skill in the art could identify whether a variant PRO1112 
sequence falls within the parameters of the claimed invention. Once such an amino acid 
sequence was identified, the specification sets forth methods for making the amino acid 
sequences (see page 376, line 9) and methods of preparing the PRO polypeptides (see 
Examples 140-143). 

Currently pending Claims 119-123 and 130-131 include the functional recitation that the 
nucleic acids encoding the claimed polypeptides are amplified in colon tumors. Appellants 
further submit that the specification provides ample written support for detecting and quantifying 
amplification of such nucleic acids in several tumors and/or cell lines as described in 
Example 170. Example 170 of the present application provides step-by-step guidelines and 
protocols for the gene amplification assay. By following this disclosure, one skilled in the art 
would know that it is easy to test whether a gene encoding a variant PRO1112 protein is 
amplified in colon tumors by the methods set forth in Example 170. 

[25] Ex parte Hiyamizu, 10 U.S.P.Q.2d 1393, 1394 (Bd. Pat. App. & Inter. 1988) (emphasis 
added). 

[26] See also M.P.E.P. §2141.03. 

Appellants refer to the arguments and information presented above in response to the 
outstanding rejections under 35 U.S.C. §101 and 35 U.S.C. §112, first paragraph, for alleged lack 
of utility and enablement. These arguments are incorporated by reference herein. Appellants 
respectfully submit that as discussed above under Issue 1, the teachings in the art, as exemplified 
by Orntoft et al., Hyman et al., Pollack et al., and the Polakis Declaration, overwhelmingly show 
that gene amplification influences gene expression at the mRNA and protein levels. Thus, the 
amplification of the encoding polynucleotide in tumors does provide useful information 
regarding the functional property of the polypeptide in being overexpressed in tumor tissues. 

Appellants further respectfully submit that whether or not the polypeptide is also 
overexpressed in tumor tissues is irrelevant to the consideration of adequate written description. 
The claims have characterized the recited polypeptides as having the property that their encoding 
polynucleotides are amplified in colon tumors. As discussed above, the specification describes 
methods for identifying genes which are amplified in colon tumors. Therefore, one of skill in the 
art could readily test a nucleic acid sequence which encodes a variant polypeptide to determine 
whether it is amplified by the methods set forth in Example 170. Thus, the recited property of 
amplification of the encoding gene adds to the characterization of the claimed polypeptide 
sequences in a manner that one of skill in the art could readily assess and understand. 

Appellants respectfully submit that the issue is whether the claimed sequences are 
adequately described. Appellants respectfully submit that the full-length sequence of PRO1112 
is clearly provided in the specification. The Examiner has acknowledged that the specification 
discloses the full-length sequence of SEQ ID NO: 207 and that polypeptides comprising the 
sequence set forth in SEQ ID NO: 207 meet the written description provision of 35 U.S.C. §112, 
first paragraph. 

Secondly, Appellants submit that the court in Fiers v. Revel held that "[i]f a conception of 
a DNA requires a precise definition, such as by structure, formula, chemical name, or physical 
properties, as we have held, then a description also requires that degree of specificity." Fiers, 
984 F.2d at 1171. Since the instant claims are directed to polypeptides, Fiers is distinguished on 
the facts and does not apply. Similarly, the Fiddes decision holds that naturally-occurring gene 
sequences cannot be patented unless the actual sequence is clearly disclosed in the patent 
application. Again, since the instant claims are directed to polypeptides, Fiddes is distinguished 
on the facts and does not apply. 

In Fiddes v. Baird, the Board of Patent Appeals and Interferences held that party Fiddes' 
claims to a human gene for basic fibroblast growth factor were separately patentable over party 
Baird's issued and pending claims specifying a sequence encoding "mammalian" basic fibroblast 
growth factor. Party Baird's issued patent disclosed the amino acid sequence that had been 
isolated from bovine pituitary and a theoretical DNA sequence encoding it. Party Baird's 
pending claims were from a continuation-in-part application disclosing the naturally-occurring 
coding sequence for bovine fibroblast growth factor. Between the filing of the first application 
and the continuation-in-part application, DNA sequences anticipating claims to the naturally- 
occurring human fibroblast growth factor were published. Party Baird argued that the published 
sequence could not be used as prior art because it was entitled to the filing date of the first 
application. 

The Board held that party Baird was not entitled to the filing date of the first application 
of its claims to the mammalian DNA sequence because it did not set out specific DNA sequences 
of naturally-occurring mammalian genes in the first application. Therefore, the first application 
did not meet the "written description" requirement for the specific DNA sequences of naturally-occurring 
mammalian genes. The Board added that party Baird was not in possession of the naturally 
occurring bovine gene at the time of filing the first application even though its encoded amino 
acid sequence was known. 

Appellants respectfully submit that in the present application, Appellants have clearly 
disclosed the full-length sequence of SEQ ID NO: 207 and its encoding nucleic acid sequence 
SEQ ID NO: 193. In addition, one of skill in the art could identify whether the variant PRO1112 
sequence falls within the parameters of the claimed invention. Furthermore, as mentioned above, 
since the instant claims are directed to polypeptides, Fiddes is distinguished on the facts and does 
not apply. Accordingly, the Examiner's assertion that Appellants provide no written description 
in the specification for any other species of PRO1112 molecules is based on the incorrect 
interpretation of the holding in the Fiddes case. 

More recently, in Enzo Biochem, Inc. v. Gen-Probe Inc., 296 F.3d 1316 (Fed. Cir. 2002), 
the court adopted the standard that "the written description requirement can be met by 'showing 
that the invention is complete by disclosure of sufficiently detailed, relevant identifying 
characteristics . . . i.e., complete or partial structure, other physical and/or chemical properties, 
functional characteristics when coupled with a known or disclosed correlation between function 
and structure, or some combination of such characteristics.'" Id. at 1324. While the invention in 
Enzo was still a DNA, the holding has been treated as being applicable to proteins as well. 
Indeed, the court adopted the standard from the USPTO's Written Description Examination 
Guidelines, which apply to both proteins and nucleic acids. 

Accordingly, current applicable case law holds that biological sequences are not 
adequately described solely by a description of their desired functional activities. The instant 
claims meet the standard set by the Enzo court in that the claimed sequences are defined not only 
by functional properties, but also by structural limitations. It is well established that a 
combination of functional and structural features may suffice to describe a claimed genus. "An 
applicant may also show that an invention is complete by disclosure of sufficiently detailed, 
relevant identifying characteristics which provide evidence that applicant was in possession of 
the claimed invention, i.e., complete or partial structure, other physical and/or chemical 
properties, functional characteristics when coupled with a known or disclosed correlation 
between function and structure, or some combination of such characteristics."27 Thus, the genus 
of polypeptides with at least 80-99% sequence identity to SEQ ID NO: 207, which possess the 
functional property of having an encoding nucleic acid that is amplified in colon tumors, would meet the 
requirement of 35 U.S.C. §112, first paragraph, as providing adequate written description. 
Accordingly, one skilled in the art would have known that Appellants had knowledge of and 
possessed the claimed polypeptides with 80-99% sequence identity to SEQ ID NO: 207 whose 
encoding nucleic acids were amplified in colon tumors. The recited property of amplification of 
the encoding gene adds to the characterization of the claimed polypeptide sequences in a manner 
that one of skill in the art could readily assess and understand. As discussed above, Appellants 
have recited structural features, namely, 80-99% sequence identity to SEQ ID NO: 207, which 
are common to the genus. Appellants have also provided guidance as to how to make the recited 
variants of SEQ ID NO: 207, including listings of exemplary and preferred sequence 
substitutions. The genus of claimed polypeptides is further defined by having a specific 
functional activity for the encoding nucleic acids. Accordingly, a description of the claimed 
genus has been achieved. 

27 M.P.E.P. §2163 II(A)(3)(a) 

Accordingly, Appellants respectfully request reconsideration and reversal of the written 
description rejection of Claims 119-123 and 130-131 under 35 U.S.C. §112, first paragraph. 


CONCLUSION 

For the reasons given above, Appellants submit that the present specification clearly 
describes, details and provides a patentable utility for the claimed invention. Moreover, it is 
respectfully submitted that based upon this disclosed patentable utility, the present specification 
clearly teaches "how to use" the presently claimed polypeptide. As such, Appellants respectfully 
request reconsideration and reversal of the outstanding rejection of Claims 119-126 and 129-131. 

The Commissioner is authorized to charge any fees which may be required, including 
extension fees, or credit any overpayment to Deposit Account No. 08-1641 (referencing 
Attorney's Docket No. 39780-2730 P1C13). 

Respectfully submitted, 

Date: February 13, 2006 

Ginger R. Dreger (Reg. No. 33,055) 

HELLER EHRMAN LLP 
275 Middlefield Road 
Menlo Park, California 94025-3506 
Telephone: (650) 324-7000 
Facsimile: (650) 324-0638 






On Appeal to the Board of Patent Appeals and Interferences 

Appellants' Brief 
Application Serial No. 09/992,643 
Attorney's Docket No. 39780-2730 P1C13 



VIII. CLAIMS APPENDIX 



Claims on Appeal 



119. An isolated native sequence polypeptide having at least 80% amino acid sequence 
identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:207; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:207, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951; 

wherein, the nucleic acid encoding said polypeptide is amplified in lung or colon tumors. 

120. The isolated native sequence polypeptide of claim 119 having at least 85% amino 
acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:207; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:207, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951; 

wherein, the nucleic acid encoding said polypeptide is amplified in lung or colon tumors. 

121. The isolated native sequence polypeptide of claim 119 having at least 90% amino 
acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:207; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:207, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951; 

wherein, the nucleic acid encoding said polypeptide is amplified in lung or colon tumors. 

122. The isolated native sequence polypeptide of claim 119 having at least 95% amino 
acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:207; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:207, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951; 

wherein, the nucleic acid encoding said polypeptide is amplified in lung or colon tumors. 

123. The isolated native sequence polypeptide of claim 119 having at least 99% amino 
acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:207; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:207, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951; 

wherein, the nucleic acid encoding said polypeptide is amplified in lung or colon tumors. 

124. An isolated polypeptide comprising: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO:207; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO:207, lacking its 
associated signal peptide; or 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 209951; 

wherein, the nucleic acid encoding said polypeptide is amplified in lung or colon tumors. 

125. The isolated polypeptide of Claim 124 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:207. 

126. The isolated polypeptide of Claim 124 comprising the amino acid sequence of the 
polypeptide of SEQ ID NO:207, lacking its associated signal peptide. 


129. The isolated polypeptide of Claim 124 comprising the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 209951. 

130. A chimeric polypeptide comprising a polypeptide according to Claim 124 fused to 
a heterologous polypeptide. 

131. The chimeric polypeptide of Claim 130, wherein said heterologous polypeptide is 
an epitope tag or an Fc region of an immunoglobulin. 

IX. EVIDENCE APPENDIX 

1. Declaration of Paul Polakis, Ph.D. under 37 C.F.R. §1.132. 

2. Declaration of Avi Ashkenazi, Ph.D. under 37 C.F.R. §1.132, with attached 
Exhibit A (Curriculum Vitae). 

3. Declaration of Audrey Goddard, Ph.D. under 37 C.F.R. §1.132, with attached 
Exhibits A-G: 

A. Curriculum Vitae of Audrey D. Goddard, Ph.D. 

B. Higuchi, R. et al., "Simultaneous amplification and detection of specific 
DNA sequences," Biotechnology 10:413-417 (1992). 

C. Livak, K.J., et al., "Oligonucleotides with fluorescent dyes at opposite 
ends provide a quenched probe system useful for detecting PCR product 
and nucleic acid hybridization," PCR Methods Appl. 4:357-362 (1995). 

D. Heid, C.A., et al., "Real time quantitative PCR," Genome Res. 6:986-994 
(1996). 

E. Pennica, D. et al., "WISP genes are members of the connective tissue 
growth factor family that are up-regulated in Wnt-1-transformed cells and 
aberrantly expressed in human colon tumors," Proc. Natl. Acad. Sci. USA 
95:14717-14722 (1998). 

F. Pitti, R.M. et al., "Genomic amplification of a decoy receptor for Fas 
ligand in lung and colon cancer," Nature 396:699-703 (1998). 

G. Bieche, I. et al., "Novel approach to quantitative polymerase chain 
reaction using real-time detection: Application to the detection of gene 
amplification in breast cancer," Int. J. Cancer 78:661-666 (1998). 

4. Orntoft, T.F., et al., "Genome-wide Study of Gene Copy Numbers, Transcripts, 
and Protein Levels in Pairs of Non-Invasive and Invasive Human Transitional Cell Carcinomas," 
Molecular & Cellular Proteomics 1:37-45 (2002). 

5. Hyman, E., et al., "Impact of DNA Amplification on Gene Expression Patterns in 
Breast Cancer," Cancer Research 62:6240-6245 (2002). 

6. Pollack, J.R., et al., "Microarray Analysis Reveals a Major Direct Role of DNA 
Copy Number Alteration in the Transcriptional Program of Human Breast Tumors," Proc. Natl. 
Acad. Sci. USA 99:12963-12968 (2002). 

7. Hanna et al., "HER-2/neu Breast Cancer Predictive Testing," Pathology 
Associates Medical Laboratories (1999). 

8. Pennica, D. et al., "WISP genes are members of the connective tissue growth 
factor family that are up-regulated in Wnt-1-transformed cells and aberrantly expressed in human 
colon tumors," Proc. Natl. Acad. Sci. USA 95:14717-14722 (1998). 

9. Konopka et al., "Variable Expression of the Translocated c-abl oncogene in 
Philadelphia-chromosome-positive B-lymphoid cell lines from chronic myelogenous leukemia 
patients," Proc. Natl. Acad. Sci. USA 83:4049-52 (1986). 

10. Haynes et al., "Proteome analysis: Biological assay or data archive?" 
Electrophoresis 19:1862-1871 (1996). 

11. Lian et al., "Genomic and proteomic analysis of the myeloid differentiation 
program," Blood 98:513-524 (2001). 

12. Fessler et al., "A genomic and proteomic analysis of activation of the human 
neutrophil by lipopolysaccharide and its mediation by p38 mitogen-activated protein kinase," J. 
Biol. Chem. 277:31291-31302 (2002). 

13. Hu et al., "Analysis of genomic and proteomic data using advanced literature 
mining," J. Proteome Res. 2:405-412 (2003). 

14. Chen et al., "Discordant Protein and mRNA Expression in Lung 
Adenocarcinomas," Mol. Cellular Proteomics 1:304-313 (2002). 

15. Hittelman et al., Ann. N.Y. Acad. Sci. 952:1-12 (2001). 

Items 1-2 and 4-7 were submitted with Appellants' Response filed June 25, 2004, and were 
considered by the Examiner as indicated in the Final Office Action mailed September 17, 2004. 

Item 3 was submitted with Appellants' Response filed August 4, 2005, and was considered by 
the Examiner as indicated in the Final Office Action mailed August 15, 2005. 

Items 8-10 were made of record by the Examiner in the Office Action mailed February 25, 2004. 

Item 13 was made of record by the Examiner in the Final Office Action mailed 
September 17, 2004. 

Items 11-12 and 14-15 were made of record by the Examiner in the Final Office Action mailed 
August 15, 2005. 

X. RELATED PROCEEDINGS APPENDIX 



None. 




Appeal Brief 
Application Serial No. 09/993,687 
Attorney's Docket No. 39780-2730 P1C11 




DECLARATION OF PAUL POLAKIS, Ph.D. 
I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan 
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff 
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has 
been leading Genentech's Tumor Antigen Project, which is a large research project 
with a primary focus on identifying tumor cell markers that find use as targets for 
both the diagnosis and treatment of cancer in humans. 

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells. 
The purpose of this research is to identify proteins that are abundantly expressed 
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at 
lower levels, on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins". When such a tumor antigen protein is 
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we 
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4 
above, we have observed that there is a strong correlation between changes in the 
level of mRNA present in any particular cell type and the level of protein 
expressed from that mRNA in that cell type. In approximately 80% of our 
observations we have found that increases in the level of a particular mRNA 
correlates with changes in the level of protein expressed from that mRNA when 
human tumor cells are compared with their corresponding normal cells. 

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a 
central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded protein. While there have been 
published reports of genes for which such a correlation does not exist, it is my 
opinion that such reports are exceptions to the commonly understood general rule 
that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein. 

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued 
thereon. 



Dated: [illegible] 




Paul Polakis, Ph.D. 

Ashkenazi, A. Control of TRAIL-induced apoptosis by a family of signaling and 
decoy receptors. Science 277, 818-821 (1997). 

43. Marsters, S., Sheridan, J., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., Huang, A., 
Yuan, J., Goddard, A., Godowski, P., and Ashkenazi, A. A novel receptor for 
Apo2L/TRAIL contains a truncated death domain. Curr. Biol. 7, 1003-1006 (1997). 

44. Marsters, A., Sheridan, J., Pitti, R., Brush, J., Goddard, A., and Ashkenazi, A. 
Identification of a ligand for the death-domain-containing receptor Apo3. Curr. Biol. 
8, 525-528 (1998). 

45. Rieger, J., Naumann, U., Glaser, T., Ashkenazi, A., and Weller, M. Apo2 ligand: 
a novel weapon against malignant glioma? FEBS Lett. 427, 124-128 (1998). 

46. Pender, S., Fell, J., Chamow, S., Ashkenazi, A., and MacDonald, T. A p55 TNF 
receptor immunoadhesin prevents T cell mediated intestinal injury by inhibiting 
matrix metalloproteinase production. J. Immunol. 160, 4098-4103 (1998). 

47. Pitti, R., Marsters, S., Lawrence, D., Roy, M., Kischkel, F., Dowd, P., Huang, A., 
Donahue, C., Sherwood, S., Baldwin, D., Godowski, P., Wood, W., Gurney, A., 
Hillan, K., Cohen, R., Goddard, A., Botstein, D., and Ashkenazi, A. Genomic 
amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature 
396, 699-703 (1998). 

48. Mori, S., Marakami-Mori, K., Nakamura, S., Ashkenazi, A., and Bonavida, B. 
Sensitization of AIDS Kaposi's sarcoma cells to Apo-2 ligand-induced apoptosis 
by actinomycin D. J. Immunol. 162, 5616-5623 (1999). 

49. Gurney, A., Marsters, S., Huang, A., Pitti, R., Mark, M., Baldwin, D., Gray, A., 
Dowd, P., Brush, J., Heldens, S., Schow, P., Goddard, A., Wood, W., Baker, K., 
Godowski, P., and Ashkenazi, A. Identification of a new member of the tumor 
necrosis factor family and its receptor, a human ortholog of mouse GITR. Curr. 
Biol. 9, 215-218 (1999). 

50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie, 
C., Chang, L., McMurtrey, A., Hebert, A., DeForge, L., Khoumenis, I., Lewis, D., 
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and 
anti-tumor activity of recombinant soluble Apo2 ligand. J. Clin. Invest. 104, 
155-162 (1999). 

51. Chuntharapai, A., Gibbs, V., Lu, J., Ow, A., Marsters, S., Ashkenazi, A., De Vos, 
A., Kim, K.J. Determination of residues involved in ligand binding and signal 
transmission in the human IFN-α receptor 2. J. Immunol. 163, 766-773 (1999). 

52. Johnsen, A.-C., Haux, J., Steinkjer, B., Nonstad, U., Egeberg, K., Sundan, A., 
Ashkenazi, A., and Espevik, T. Regulation of Apo2L/TRAIL expression in NK 
cells - involvement in NK cell-mediated cytotoxicity. Cytokine 11, 664-672 
(1999). 

53. Roth, W., Isenmann, S., Naumann, U., Kugler, S., Bahr, M., Dichgans, J., 
Ashkenazi, A., and Weller, M. Eradication of intracranial human malignant 
glioma xenografts by Apo2L/TRAIL. Biochem. Biophys. Res. Commun. 265, 
479-483 (1999). 

54. Hymowitz, S.G., Christinger, H.W., Fuh, G., Ultsch, M., O'Connell, M., Kelley, 
R.F., Ashkenazi, A., and de Vos, A.M. Triggering Cell Death: The Crystal 
Structure of Apo2L/TRAIL in a Complex with Death Receptor 5. Molec. Cell 4, 
563-571 (1999). 

55. Hymowitz, S.G., O'Connell, M.P., Ultsch, M.H., Hurst, A., Totpal, K., Ashkenazi, 
A., de Vos, A.M., Kelley, R.F. A unique zinc-binding site revealed by a 
high-resolution X-ray structure of homotrimeric Apo2L/TRAIL. Biochemistry 39, 
633-640 (2000). 

56. Zhou, Q., Fukushima, P., DeGraff, W., Mitchell, J.B., Stetler-Stevenson, M., 
Ashkenazi, A., and Steeg, P.S. Radiation and the Apo2L/TRAIL apoptotic 
pathway preferentially inhibit the colonization of premalignant human breast 
cancer cells overexpressing cyclin D1. Cancer Res. 60, 2611-2615 (2000). 

57. Kischkel, F.C., Lawrence, D.A., Chuntharapai, A., Schow, P., Kim, J., and 
Ashkenazi, A. Apo2L/TRAIL-dependent recruitment of endogenous FADD and 
Caspase-8 to death receptors 4 and 5. Immunity 12, 611-620 (2000). 

58. Yan, M., Marsters, S.A., Grewal, I.S., Wang, H., *Ashkenazi, A., and *Dixit, 
V.M. Identification of a receptor for BLyS demonstrates a crucial role in humoral 
immunity. Nature Immunol. 1, 37-41 (2000). 

59. Marsters, S.A., Yan, M., Pitti, R.M., Haas, P.E., Dixit, V.M., and Ashkenazi, A. 
Interaction of the TNF homologues BLyS and APRIL with the TNF receptor 
homologues BCMA and TACI. Curr. Biol. 10, 785-788 (2000). 

60. Kischkel, F.C., and Ashkenazi, A. Combining enhanced metabolic labeling with 
immunoblotting to detect interactions of endogenous cellular proteins. 
Biotechniques 29, 506-512 (2000). 

61. Lawrence, D., Shahrokh, Z., Marsters, S., Achilles, K., Shih, D., Mounho, B., 
Hillan, K., Totpal, K., DeForge, L., Schow, P., Hooley, J., Sherwood, S., Pai, R., 
Leung, S., Khan, L., Gliniak, B., Bussiere, J., Smith, C., Strom, S., Kelley, S., 
Fox, J., Thomas, D., and Ashkenazi, A. Differential hepatocyte toxicity of 
recombinant Apo2L/TRAIL versions. Nature Med. 7, 383-385 (2001). 

62. Chuntharapai, A., Dodge, K., Grimmer, K., Schroeder, K., Marsters, S.A., 
Koeppen, H., Ashkenazi, A., and Kim, K.J. Isotype-dependent inhibition of 
tumor growth in vivo by monoclonal antibodies to death receptor 4. J. Immunol. 
166, 4891-4898 (2001). 

63. Pollack, I.F., Erff, M., and Ashkenazi, A. Direct stimulation of apoptotic 
signaling by soluble Apo2L/tumor necrosis factor-related apoptosis-inducing 
ligand leads to selective killing of glioma cells. Clin. Cancer Res. 7, 1362-1369 
(2001). 

64. Wang, H., Marsters, S.A., Baker, T., Chan, B., Lee, W.P., Fu, L., Tumas, D., Yan, 
M., Dixit, V.M., *Ashkenazi, A., and *Grewal, I.S. TACI-ligand interactions are 
required for T cell activation and collagen-induced arthritis in mice. Nature 
Immunol. 2, 632-637 (2001). 

65. Kischkel, F.C., Lawrence, D.A., Tinel, A., Virmani, A., Schow, P., Gazdar, A., 
Blenis, J., Arnott, D., and Ashkenazi, A. Death receptor recruitment of 
endogenous caspase-10 and apoptosis initiation in the absence of caspase-8. J. 
Biol. Chem. 276, 46639-46646 (2001). 

66. LeBlanc, H., Lawrence, D.A., Varfolomeev, E., Totpal, K., Morlan, J., Schow, P., 
Fong, S., Schwall, R., Sinicropi, D., and Ashkenazi, A. Tumor cell resistance to 
death receptor induced apoptosis through mutational inactivation of the 
proapoptotic Bcl-2 homolog Bax. Nature Med. 8, 274-281 (2002). 

67. Miller, K., Meng, G., Liu, J., Hurst, A., Hsei, V., Wong, W-L., Ekert, R., 
Lawrence, D., Sherwood, S., DeForge, L., Gaudreault., Keller, G., Sliwkowski, 
M., Ashkenazi, A., and Presta, L. Design, Construction, and analyses of 
multivalent antibodies. J. Immunol. 170, 4854-4861 (2003). 

68. Varfolomeev, E., Kischkel, F., Martin, F., Wanh, H., Lawrence, D., Olsson, C., 
Tom, L., Erickson, S., French, D., Schow, P., Grewal, I., and Ashkenazi, A. 
Immune system development in APRIL knockout mice. Submitted. 

Review articles: 

1. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. J. 
Functional role of muscarinic acetylcholine receptor subtype diversity. Cold 
Spring Harbor Symposium on Quantitative Biology, LIII, 263-272 (1988). 

2. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. 
Functional diversity of muscarinic receptor subtypes in cellular signal 
transduction and growth. Trends Pharmacol. Sci. Dec Supplement, 12-21 (1989). 

3. Chamow, S., Duliege, A., Ammann, A., Kahn, J., Allen, D., Eichberg, J., Byrn, 
R., Capon, D., Ward, R., and Ashkenazi, A. CD4 immunoadhesins in anti-HIV 
therapy: new developments. Int. J. Cancer Supplement 7, 69-72 (1992). 

4. Ashkenazi, A., Capon, D., and Ward, R. Immunoadhesins. Int. Rev. Immunol. 10, 
217-225 (1993). 

5. Ashkenazi, A., and Peralta, E. Muscarinic Receptors. In Handbook of Receptors 
and Channels. (S. Peroutka, ed.), CRC Press, Boca Raton, Vol. I, p. 1-27 (1994). 

6. Krantz, S. B., Means, R. T., Jr., Lma, J., Marsters, S. A., and Ashkenazi, A. 
Inhibition of erythroid colony formation in vitro by gamma interferon. In 
Molecular Biology of Hematopoiesis (N. Abraham, R. Shadduck, A. Levine, F. 
Takaku, eds.) Intercept Ltd. Paris, Vol. 3, p. 135-147 (1994). 

7. Ashkenazi, A. Cytokine neutralization as a potential therapeutic approach for 
SIRS and shock. J. Biotechnology in Healthcare 1, 197-206 (1994). 

8. Ashkenazi, A., and Chamow, S. M. Immunoadhesins: an alternative to human 
monoclonal antibodies. Immunomethods: A companion to Methods in 
Enzymology 8, 104-115 (1995). 

9. Chamow, S., and Ashkenazi, A. Immunoadhesins: Principles and Applications. 
Trends Biotech. 14, 52-60 (1996). 

10. Ashkenazi, A., and Chamow, S. M. Immunoadhesins as research tools and 
therapeutic agents. Curr. Opin. Immunol. 9, 195-200 (1997). 

11. Ashkenazi, A., and Dixit, V. Death receptors: signaling and modulation. Science 
281, 1305-1308 (1998). 

12. Ashkenazi, A., and Dixit, V. Apoptosis control by death and decoy receptors. 
Curr. Opin. Cell. Biol. 11, 255-260 (1999). 

13. Ashkenazi, A. Chapters on Apo2L/TRAIL; DR4, DR5, DcR1, DcR2; and DcR3. 
Online Cytokine Handbook (www.apnet.com/cytokinereference/). 

14. Ashkenazi, A. Targeting death and decoy receptors of the tumor necrosis factor 
superfamily. Nature Rev. Cancer 2, 420-430 (2002). 

15. LeBlanc, H. and Ashkenazi, A. Apoptosis signaling by Apo2L/TRAIL. Cell Death 
and Differentiation 10, 66-75 (2003). 

16. Almasan, A. and Ashkenazi, A. Apo2L/TRAIL: apoptosis signaling, biology, and 
potential for cancer therapy. Cytokine and Growth Factor Reviews 14, 337-348 
(2003). 



Book: 



Antibody Fusion Proteins (Chamow, S., and Ashkenazi, A., eds., John Wiley and 
Sons Inc.) (1999). 



Talks: 

1. Resistance of primary HIV isolates to CD4 is independent of CD4-gpl20 binding 
affinity. UCSD Symposium, HIV Disease: Pathogenesis and Therapy. 
Greenelefe, FL, March 1991. 

2. Use of immuno-hybrids to extend the half-life of receptors. IBC conference on 
Biopharmaceutical Halflife Extension. New Orleans, LA, June 1992. 

3. Results with TNF receptor immunoadhesins for the Treatment of Sepsis. IBC 
conference on Endotoxemia and Sepsis. Philadelphia, PA, June 1992. 

4. Immunoadhesins: an alternative to human antibodies. IBC conference on 
Antibody Engineering. San Diego, CA, December 1993. 

5. Tumor necrosis factor receptor: a potential therapeutic for human septic shock. 
American Society for Microbiology Meeting, Atlanta, GA, May 1993. 

6. Protective efficacy of TNF receptor immunoadhesin vs anti-TNF monoclonal 
antibody in a rat model for endotoxic shock. 5th International Congress on TNF. 
Asilomar, CA, May 1994. 

7. Interferon-γ signals via a multisubunit receptor complex that contains two types of 
polypeptide chain. American Association of Immunologists Conference. San 
Francisco, CA, July 1995. 

8. Immunoadhesins: Principles and Applications. Gordon Research Conference on 
Drug Delivery in Biology and Medicine. Ventura, CA, February 1996. 

9. Apo-2 Ligand, a new member of the TNF family that induces apoptosis in tumor 
cells. Cambridge Symposium on TNF and Related Cytokines in Treatment of 
Cancer. Hilton-Head, NC, March 1996. 

10. Induction of apoptosis by Apo2 Ligand. American Society for Biochemistry and 
Molecular Biology, Symposium on Growth Factors and Cytokine Receptors. New 
Orleans, LA, June, 1996. 

11. Apo2 ligand, an extracellular trigger of apoptosis. 2nd Clontech Symposium, 
Palo Alto, CA, October 1996. 

12. Regulation of apoptosis by members of the TNF ligand and receptor families. 
Stanford University School of Medicine, Palo Alto, CA, December 1996. 

13. Apo-3: a novel receptor that regulates cell death and inflammation. 4th 
International Congress on Immune Consequences of Trauma, Shock, and Sepsis. 
Munich, Germany, March 1997. 

14. New members of the TNF ligand and receptor families that regulate apoptosis, 
inflammation, and immunity. UCLA School of Medicine. LA, CA, March 1997. 

15. Immunoadhesins: an alternative to monoclonal antibodies. 5th World Conference 
on Bispecific Antibodies. Volendam, Holland, June 1997. 

16. Control of Apo2L signaling. Cold Spring Harbor Laboratory Symposium on 
Programmed Cell Death. Cold Spring Harbor, New York, September 1997. 

17. Chairman and speaker. Apoptosis Signaling session. IBC's 4th Annual 
Conference on Apoptosis. San Diego, CA., October 1997. 

18. Control of Apo2L signaling by death and decoy receptors. American Association 
for the Advancement of Science. Philadelphia, PA, February 1998. 

19. Apo2 ligand and its receptors. American Society of Immunologists. San 
Francisco, CA, April 1998. 

20. Death receptors and ligands. 7th International TNF Congress. Cape Cod, MA, 
May 1998. 

21. Apo2L as a potential therapeutic for cancer. UCLA School of Medicine. LA, 
CA, June 1998. 

22. Apo2L as a potential therapeutic for cancer. Gordon Research Conference on 
Cancer Chemotherapy. New London, NH, July 1998. 

23. Control of apoptosis by Apo2L. Endocrine Society Conference, Stevenson, WA, 
August 1998. 

24. Control of apoptosis by Apo2L. International Cytokine Society Conference, 
Jerusalem, Israel, October 1998. 

25. Apoptosis control by death and decoy receptors. American Association for 
Cancer Research Conference, Whistler, BC, Canada, March 1999. 

26. Apoptosis control by death and decoy receptors. American Society for 
Biochemistry and Molecular Biology Conference, San Francisco, CA, May 1999. 

27. Apoptosis control by death and decoy receptors. Gordon Research Conference on 
Apoptosis, New London, NH, June 1999. 

28. Apoptosis control by death and decoy receptors. Arthritis Foundation Research 
Conference, Alexandria GA, Aug 1999. 

29. Safety and anti-tumor activity of recombinant soluble Apo2L/TRAIL. Cold 
Spring Harbor Laboratory Symposium on Programmed Cell Death. Cold Spring 
Harbor, NY, September 1999. 

30. The Apo2L/TRAIL system: therapeutic potential. American Association for 
Cancer Research, Lake Tahoe, NV, Feb 2000. 

31 . Apoptosis and cancer therapy. Stanford University School of Medicine, Stanford, 
CA, Mar 2000. 

32. Apoptosis and cancer therapy. University of Pennsylvania School of Medicine, 
Philadelphia, PA, Apr 2000.

33. Apoptosis signaling by Apo2L/TRAIL. International Congress on TNF.
Trondheim, Norway, May 2000. 

34. The Apo2L/TRAIL system: therapeutic potential. Cap-CURE summit meeting.
Santa Monica, CA, June 2000. 

35. The Apo2L/TRAIL system: therapeutic potential. MD Anderson Cancer Center.
Houston, TX, June 2000. 

36. Apoptosis signaling by Apo2L/TRAIL. The Protein Society, 14th Symposium.
San Diego, CA, August 2000. 

37. Anti-tumor activity of Apo2L/TRAIL. AAPS annual meeting. Indianapolis, IN,
Aug 2000.

38. Apoptosis signaling and anti-cancer potential of Apo2L/TRAIL. Cancer Research
Institute, UC San Francisco, CA, September 2000. 

39. Apoptosis signaling by Apo2L/TRAIL. Keynote address, TNF family
Minisymposium, NIH. Bethesda, MD, September 2000. 

40. Death receptors: signaling and modulation. Keystone symposium on the 
Molecular basis of cancer. Taos, NM, Jan 2001. 

41. Preclinical studies of Apo2L/TRAIL in cancer. Symposium on Targeted therapies
in the treatment of lung cancer. Aspen, CO, Jan 2001.

42. Apoptosis signaling by Apo2L/TRAIL. Weizmann Institute of Science, Rehovot,
Israel, March 2001.

43. Apo2L/TRAIL: Apoptosis signaling and potential for cancer therapy. Weizmann
Institute of Science, Rehovot, Israel, March 2001. 

44. Targeting death receptors in cancer with Apo2L/TRAIL. Cell Death and Disease
conference, North Falmouth, MA, Jun 2001.

45. Targeting death receptors in cancer with Apo2L/TRAIL. Biotechnology
Organization conference, San Diego, CA, Jun 2001. 

46. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Gordon Research
Conference on Apoptosis, Oxford, UK, July 2001.

47. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Cleveland Clinic 
Foundation, Cleveland, OH, Oct 2001. 

48. Apoptosis signaling by death receptors: overview. International Society for
Interferon and Cytokine Research conference, Cleveland, OH, Oct 2001. 

49. Apoptosis signaling by death receptors. American Society of Nephrology
Conference. San Francisco, CA, Oct 2001.

50. Targeting death receptors in cancer. Apoptosis: commercial opportunities. San 
Diego, CA, Apr 2002. 

51. Apo2L/TRAIL signaling and apoptosis resistance mechanisms. Kimmel Cancer 
Research Center, Johns Hopkins University, Baltimore MD. May 2002. 

52. Apoptosis control by Apo2L/TRAIL. (Keynote Address) University of Alabama 
Cancer Center Retreat, Birmingham, AL. October 2002.

53. Apoptosis signaling by Apo2L/TRAIL. (Session co-chair) TNF international
conference. San Diego, CA. October 2002.

54. Apoptosis signaling by Apo2L/TRAIL. Swiss Institute for Cancer Research 
(ISREC). Lausanne, Switzerland. Jan 2003.

55. Apoptosis induction with Apo2L/TRAIL. Conference on New Targets and
Innovative Strategies in Cancer Treatment. Monte Carlo. February 2003. 

56. Apoptosis signaling by Apo2L/TRAIL. Hermelin Brain Tumor Center
Symposium on Apoptosis. Detroit, MI. April 2003. 

57. Targeting apoptosis through death receptors. Sixth Annual Conference on 
Targeted Therapies in the Treatment of Breast Cancer. Kona, Hawaii. July 2003. 

58. Targeting apoptosis through death receptors. Second International Conference on 
Targeted Cancer Therapy. Washington, DC. Aug 2003. 

Issued Patents:

Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,329,028 (Jul 12, 1994).

Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,605,791 (Feb 25, 1997). 

Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 5,889,155 (Jul 27, 1999). 
Ashkenazi, A., APO-2 Ligand. US patent 6,030,945 (Feb 29, 2000).
Ashkenazi, A., Chuntharapai, A., Kim, J., APO-2 ligand antibodies. US patent 6,046,048
(Apr 4, 2000).

Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 6,124,435 (Sep 26, 2000). 

Ashkenazi, A., Chuntharapai, A., Kim, J., Method for making monoclonal and cross- 
reactive antibodies. US patent 6,252,050 (Jun 26, 2001). 
Ashkenazi, A. APO-2 Receptor. US patent 6,342,369 (Jan 29, 2002). 
Ashkenazi, A., Fong, S., Goddard, A., Gurney, A., Napier, M., Tumas, D., Wood, W.
A-33 polypeptides. US patent 6,410,708 (Jun 25, 2002).
Ashkenazi, A. APO-3 Receptor. US patent 6,462,176 B1 (Oct 8, 2002).
Ashkenazi, A. APO-2LI and APO-3 polypeptide antibodies. US patent 6,469,144 B1
(Oct 22, 2002).

Ashkenazi, A., Chamow, S. and Kogan, T. Carbohydrate-directed crosslinking
reagents. US patent 6,582,928B1 (Jun 24, 2003).

THE UNITED STATES PATENT AND TRADEMARK OFFICE

PATENT

In re Application of: Ashkenazi et al.
Group Art Unit: 1647
Serial No.: 09/903,925
Examiner: Fozia Hamid
Filed: July 11, 2001
For: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC ACIDS

DECLARATION OF AUDREY D. GODDARD, Ph.D. UNDER 37 C.F.R. § 1.132

Assistant Commissioner of Patents 
Washington, D.C. 20231 

Sir: 

I, Audrey D. Goddard, Ph.D., do hereby declare and say as follows:

1. I am a Senior Clinical Scientist at the Experimental Medicine/BioOncology, Medical
Affairs Department of Genentech, Inc., South San Francisco, California 94080.

2. Between 1993 and 2001, I headed the DNA Sequencing Laboratory at the Molecular
Biology Department of Genentech, Inc. During this time, my responsibilities included the
identification and characterization of genes contributing to the oncogenic process, and determination
of the chromosomal localization of novel genes.

3. My scientific Curriculum Vitae, including my list of publications, is attached to and
forms part of this Declaration (Exhibit A).

4. I am familiar with a variety of techniques known in the art for detecting and 
quantifying the amplification of oncogenes in cancer, including the quantitative TaqMan PCR (i.e., 
"gene amplification") assay described in the above captioned patent application. 

5. The TaqMan PCR assay is described, for example, in the following scientific
publications: Higuchi et al., Biotechnology 10:413-417 (1992) (Exhibit B); Livak et al., PCR
Methods Appl. 4:357-362 (1995) (Exhibit C); and Heid et al., Genome Res. 6:986-994 (1996)
(Exhibit D). Briefly, the assay is based on the principle that successful PCR yields a fluorescent
signal due to Taq DNA polymerase-mediated exonuclease digestion of a fluorescently labeled
oligonucleotide that is homologous to a sequence between two PCR primers. The extent of
digestion depends directly on the amount of PCR, and can be quantified accurately by measuring the
increment in fluorescence that results from decreased energy transfer. This is an extremely sensitive
technique, which allows detection in the exponential phase of the PCR reaction and, as a result,
leads to accurate determination of gene copy number.
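
By way of illustration only (this sketch is not part of the declaration or of the cited exhibits), detection in the exponential phase is commonly summarized by a threshold cycle: the fractional cycle number at which the fluorescence signal first crosses a fixed threshold. The function name `threshold_cycle`, the threshold value, and the linear interpolation between cycles are all assumptions made for this example, and the readings are idealized.

```python
def threshold_cycle(fluorescence, threshold):
    """Return the fractional cycle at which fluorescence first reaches threshold.

    fluorescence[i] is the (hypothetical) reading taken after cycle i + 1.
    Returns None if the threshold is never reached.
    """
    if fluorescence and fluorescence[0] >= threshold:
        return 1.0
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev was read after cycle i, curr after cycle i + 1;
            # interpolate linearly between the two readings.
            return i + (threshold - prev) / (curr - prev)
    return None


# Idealized signal that doubles every cycle. A sample starting with twice
# as much template crosses the same threshold one cycle earlier.
one_copy = [1, 2, 4, 8, 16]
two_copies = [2, 4, 8, 16, 32]
print(threshold_cycle(one_copy, 6), threshold_cycle(two_copies, 6))
```

Under these idealized assumptions, a one-cycle difference in threshold cycle corresponds to a 2-fold difference in starting template.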

6. The quantitative fluorescent TaqMan PCR assay has been extensively and 
successfully used to characterize genes involved in cancer development and progression. 
Amplification of protooncogenes has been studied in a variety of human tumors, and is widely 
considered as having etiological, diagnostic and prognostic significance. This use of the quantitative 
TaqMan PCR assay is exemplified by the following scientific publications: Pennica et al., Proc.
Natl. Acad. Sci. USA 95(25):14717-14722 (1998) (Exhibit E); Pitti et al., Nature
396(6712):699-703 (1998) (Exhibit F); and Bieche et al., Int. J. Cancer 78:661-666 (1998) (Exhibit
G), of the first two of which I am a co-author. In particular, Pennica et al. have used the quantitative
TaqMan PCR assay to study relative gene amplification of WISP and c-myc in various cell lines, 
colorectal tumors and normal mucosa. Pitti et al studied the genomic amplification of a decoy 
receptor for Fas ligand in lung and colon cancer, using the quantitative TaqMan PCR assay. Bieche 
et al used the assay to study gene amplification in breast cancer.

7. It is my personal experience that the quantitative TaqMan PCR technique is
technically sensitive enough to detect at least a 2-fold increase in gene copy number relative to
control. It is further my considered scientific opinion that an at least 2-fold increase in gene copy
number in a tumor tissue sample relative to a normal (i.e., non-tumor) sample is significant and
useful in that the detected increase in gene copy number in the tumor sample relative to the normal
sample serves as a basis for using relative gene copy number as quantitated by the TaqMan PCR
technique as a diagnostic marker for the presence or absence of tumor in a tissue sample of unknown
pathology. Accordingly, a gene identified as being amplified at least 2-fold by the quantitative
TaqMan PCR assay in a tumor sample relative to a normal sample is useful as a marker for the
diagnosis of cancer, for monitoring cancer development and/or for measuring the efficacy of cancer
therapy.
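
For illustration only (the declaration does not state how the fold change is computed), a tumor-versus-normal comparison of this kind is often expressed with the comparative threshold-cycle calculation sketched below. The function names, the use of a reference gene for normalization, the assumption of perfect doubling per cycle, and the example values are all hypothetical.

```python
def relative_copy_number(ct_target_tumor, ct_ref_tumor,
                         ct_target_normal, ct_ref_normal):
    """Fold change of the target gene in tumor relative to normal tissue.

    Each argument is a threshold cycle (Ct). The reference gene is a
    hypothetical control assumed to be present at equal copy number in
    both samples. Assumes the product doubles every cycle.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_normal = ct_target_normal - ct_ref_normal
    return 2.0 ** -(delta_tumor - delta_normal)


def is_amplified(fold_change, cutoff=2.0):
    """Apply the at-least-2-fold criterion described in paragraph 7."""
    return fold_change >= cutoff


# Hypothetical values: the target crosses threshold one cycle earlier in
# tumor than in normal tissue, after normalizing to the reference gene.
fold = relative_copy_number(24.0, 20.0, 25.0, 20.0)
print(fold, is_amplified(fold))
```

A real analysis would also account for replicate variability and measured amplification efficiency, neither of which is modeled here.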

8. I declare further that all statements made herein of my own knowledge are true and
that all statements made on information and belief are believed to be true. I declare that these
statements were made with the knowledge that willful false statements and the like so made are
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States
Code, and that such willful false statements may jeopardize the validity of the application or any
patent issuing thereon.




Date 



Audrey D. Goddard, Ph.D.

AUDREY D. GODDARD, Ph.D.



Genentech, Inc. 110 Congo St.

1 DNA Way Francisco, CA, 94131 

South San Francisco, CA, 94080 415.841.9154 

650 225 6429 415.819.2247 (mobile) 

goddarda@gene.com agoddard@pacbell.net 



PROFESSIONAL EXPERIENCE 

Genentech, Inc. 1993-present
South San Francisco, CA 
2001 - present Senior Clinical Scientist 

Experimental Medicine / BioOncology. Medical Affairs 

Responsibilities: 

• Companion diagnostic oncology products 

• Acquisition of clinical samples from Genentech's clinical trials for translational research 

• Translational research using clinical specimen and data for drug development and
diagnostics

• Member of Development Science Review Committee, Diagnostic Oversight Team, 21 CFR
Part 11 Subteam

Interests:

• Ethical and legal implications of experiments with clinical specimens and data 

• Application of pharmacogenomics in clinical trials 



1998 - 2001 Senior Scientist

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 

Responsibilities:

• Management of a laboratory of up to nineteen, including postdoctoral fellow, associate
scientist, senior research associate and research assistants/associate levels

• Management of a $750K budget

• DNA sequencing core facility supporting a 350+ person research facility.

• DNA sequencing for high throughput gene discovery: ESTs, cDNAs, and constructs

• Genomic sequence analysis and gene identification 

• DNA sequence and primary protein analysis 

Research: 

• Chromosomal localization of novel genes 

• Identification and characterization of genes contributing to the oncogenic process

• Identification and characterization of genes contributing to inflammatory diseases 

• Design and development of schemes for high throughput genomic DNA sequence analysis 

• Candidate gene prediction and evaluation

1993 - 1998 Scientist

Head of the DNA Sequencing Laboratory, Molecular Biology Department, Research 
Responsibilities:

• DNA sequencing core facility supporting a 350+ person research facility

• Assumed responsibility for a pre-existing team of five technicians and expanded the group 
into fifteen, introducing a level of middle management and additional areas of research 

• Participated in the development of the basic plan for high throughput secreted protein 
discovery program - sequencing strategies, data analysis and tracking, database design 

• High throughput EST and cDNA sequencing for new gene identification, 

• Design and implementation of analysis tools required for high throughput gene identification, 

• Chromosomal localization of genes encoding novel secreted proteins. 

Research: 

• Genomic sequence scanning for new gene discovery. 

• Development of signal peptide selection methods. 

• Evaluation of candidate disease genes. 

• Growth hormone receptor gene SNPs in children with Idiopathic short stature 



Imperial Cancer Research Fund 1989-1992 
London, UK with Dr. Ellen Solomon 

6/89-12/92 Postdoctoral Fellow 

• Cloning and characterization of the genes fused at the acute promyelocytic leukemia 
translocation breakpoints on chromosomes 17 and 15. 

• Prepared a successfully funded European Union multi-center grant application 



McMaster University 

Hamilton, Ontario, Canada with Dr. G. D. Sweeney 
5/83 - 8/83: NSERC Summer Student 

• In vitro metabolism of β-naphthoflavone in C57BL/6J and DBA mice



EDUCATION 



Ph.D., 1989
University of Toronto, Toronto, Ontario, Canada. Department of Medical Biophysics.
"Phenotypic and genotypic effects of mutations in the human retinoblastoma gene."
Supervisor: Dr. R. A. Phillips

Honours B.Sc., 1983
McMaster University, Hamilton, Ontario, Canada. Department of Biochemistry.
"The in vitro metabolism of the cytochrome P-448 inducer β-naphthoflavone in C57BL/6J mice."
Supervisor: Dr. G. D. Sweeney

ACADEMIC AWARDS



Imperial Cancer Research Fund Postdoctoral Fellowship, 1989-1992
Medical Research Council Studentship, 1983-1988
NSERC Undergraduate Summer Research Award, 1983
Society of Chemical Industry Merit Award (Hons. Biochem.), 1983
Dr. Harry Lyman Hooker Scholarship, 1981-1983
J.L.W. Gill Scholarship, 1981-1982
Business and Professional Women's Club Scholarship, 1980-1981
Weyerhaeuser Foundation Scholarship, 1979-1980



INVITED PRESENTATIONS 

Genentech's gene discovery pipeline: High throughput identification, cloning and
characterization of novel genes. Functional Genomics: From Genome to Function, Litchfield
Park, AZ, USA. October 2000

High throughput identification, cloning and characterization of novel genes. G2K: Back to
Science, Advances in Genome Biology and Technology I. Marco Island, FL, USA. February
2000

Quality control in DNA Sequencing: The use of Phred and Phrap. Bay Area Sequencing
Users Meeting, Berkeley, CA, USA. April 1999

High throughput secreted protein identification and cloning. Tenth International Genome
Sequencing and Analysis Conference, Miami, FL, USA. September 1998

The evolution of DNA sequencing: The Genentech perspective. Bay Area Sequencing Users
Meeting, Berkeley, CA, USA. May 1998

Partial Growth Hormone Insensitivity: The role of GH-receptor mutations in Idiopathic Short
Stature. Tenth Annual National Cooperative Growth Study Investigators Meeting, San
Francisco, CA, USA. October 1996

Growth hormone (GH) receptor defects are present in selected children with non-GH-deficient
short stature: A molecular basis for partial GH-insensitivity. 76th Annual Meeting of The
Endocrine Society. Anaheim, CA, USA. June 1994

A previously uncharacterized gene, myl, is fused to the retinoic acid receptor alpha gene in
acute promyelocytic leukemia. XV International Association for Comparative Research on
Leukemia and Related Disease. Padua, Italy. October 1991

PATENTS

Goddard A, Godowski PJ, Gurney AL. NL2 Tie ligand homologue polypeptide. Patent
Number: 6,455,496. Date of Patent: Sept. 24, 2002.

Goddard A, Godowski PJ and Gurney AL. NL3 Tie ligand homologue nucleic acids. Patent
Number: 6,426,218. Date of Patent: July 30, 2002.

Godowski P, Gurney A, Hillan KJ, Botstein D, Goddard A, Roy M, Ferrara N, Tumas D,
Schwall R. NL4 Tie ligand homologue nucleic acid. Patent Number: 6,413,770. Date of
Patent: July 2, 2002.

Ashkenazi A, Fong S, Goddard A, Gurney AL, Napier MA, Tumas D, Wood WI. Nucleic acid
encoding A-33 related antigen polypeptides. Patent Number: 6,410,708. Date of Patent:
Jun. 25, 2002.

Botstein DA, Cohen RL, Goddard AD, Gurney AL, Hillan KJ, Lawrence DA, Levine AJ,
Pennica D, Roy MA and Wood WI. WISP polypeptides and nucleic acids encoding same.
Patent Number: 6,387,657. Date of Patent: May 14, 2002.

Goddard A, Godowski PJ and Gurney AL. Tie ligands. Patent Number: 6,372,491. Date of 
Patent: April 16, 2002. 

Godowski PJ, Gurney AL, Goddard A and Hillan K. TIE ligand homologue antibody. Patent
Number: 6,350,450. Date of Patent: Feb. 26, 2002.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Tie
receptor tyrosine kinase ligand homologues. Patent Number: 6,348,351. Date of Patent:
Feb. 19, 2002.

Goddard A, Godowski PJ and Gurney AL. Ligand homologues. Patent Number: 6,348,350.
Date of Patent: Feb. 19, 2002.

Attie KM, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth
hormone insensitivity syndrome. Patent Number: 6,207,640. Date of Patent: March 27,
2001.

Fong S, Ferrara N, Goddard A, Godowski PJ, Gurney AL, Hillan K and Williams PM. Nucleic
acids encoding NL-3. Patent Number: 6,074,873. Date of Patent: June 13, 2000

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,824,642. Date of Patent: October 20, 1998

Attie K, Carlsson LMS, Gesundheit N and Goddard A. Treatment of partial growth hormone
insensitivity syndrome. Patent Number: 5,646,113. Date of Patent: July 8, 1997

Multiple additional provisional applications filed

PUBLICATIONS

Seshasayee D, Dowd P, Gu Q, Erickson S, Goddard AD Comparative sequence analysis of 
the HER2 locus in mouse and man. Manuscript in preparation. 

Abuzzahab MJ, Goddard A, Grigorescu F, Lautier C, Smith RJ and Chernausek SD. Human 
IGF-1 receptor mutations resulting in pre- and post-natal growth retardation. Manuscript in 
preparation. 

Aggarwal S, Xie M-H, Foster J, Frantz G, Stinson J, Corpuz RT, Simmons L, Hillan K,
Yansura DG, Vandlen RL, Goddard AD and Gurney AL. FHFR, a novel receptor for the
fibroblast growth factors. Manuscript submitted.

Adams SH, Chui C, Schilbach SL, Yu XX, Goddard AD, Grimaldi JC, Lee J, Dowd P, Colman 
S., Lewin DA. (2001) BFIT, a unique acyl-CoA thioesterase induced in thermogenic brown 
adipose tissue: Cloning, organization of the human gene, and assessment of a potential link 
to obesity. Biochemical Journal 360: 135-142.

Lee J, Ho WH, Maruoka M, Corpuz RT, Baldwin DT, Foster JS, Goddard AD, Yansura DG,
Vandlen RL, Wood WI, Gurney AL. (2001) IL-17E, a novel proinflammatory ligand for the
IL-17 receptor homolog IL-17Rh1. Journal of Biological Chemistry 276(2): 1660-1664.

Xie M-H, Aggarwal S, Ho W-H, Foster J, Zhang Z, Stinson J, Wood WI, Goddard AD and
Gurney AL. (2000) Interleukin (IL)-22, a novel human cytokine that signals through the 
interferon-receptor related proteins CRF2-4 and IL-22R. Journal of Biological Chemistry 275: 
31335-31339. 

Weiss GA, Watanabe CK, Zhong A, Goddard A and Sidhu SS. (2000) Rapid mapping of 
protein functional epitopes by combinatorial alanine scanning. Proc. Natl. Acad. Sci. USA 97:
8950-8954. 

Guo S, Yamaguchi Y, Schilbach S, Wada T, Lee J, Goddard A, French D, Handa H,
Rosenthal A. (2000) A regulator of transcriptional elongation controls vertebrate neuronal 
development. Nature 408: 366-369. 

Yan M, Wang L-C, Hymowitz SG, Schilbach S, Lee J, Goddard A, de Vos AM, Gao WQ, Dixit
VM. (2000) Two-amino acid molecular switch in an epithelial morphogen that regulates 
binding to two distinct receptors. Science 290: 523-527. 

Sehl PD, Tai JTN, Hillan KJ, Brown LA, Goddard A, Yang R, Jin H and Lowe DG. (2000) 
Application of cDNA microarrays in determining molecular phenotype in cardiac growth, 
development, and response to injury. Circulation 101: 1990-1999.

Guo S, Brush J, Teraoka H, Goddard A, Wilson SW, Mullins MC and Rosenthal A. (1999) 
Development of noradrenergic neurons in the zebrafish hindbrain requires BMP, FGF8, and 
the homeodomain protein soulless/Phox2A. Neuron 24: 555-566. 

Stone D, Murone M, Luoh S, Ye W, Armanini P, Gurney A, Phillips HS, Brush J, Goddard
A, de Sauvage FJ and Rosenthal A. (1999) Characterization of the human suppressor of
fused; a negative regulator of the zinc-finger transcription factor Gli. J. Cell Sci. 112: 4437-
4448.

Xie M-H, Holcomb I, Deuel B, Dowd P, Huang A, Vagts A, Foster J, Liang J, Brush J, Gu Q,
Hillan K, Goddard A and Gurney AL. (1999) FGF-19, a novel fibroblast growth factor with
unique specificity for FGFR4. Cytokine 11: 729-735.

Yan M, Lee J, Schilbach S, Goddard A and Dixit V. (1999) mE10, a novel caspase
recruitment domain-containing proapoptotic molecule. J. Biol. Chem. 274(15): 10287-10292.

Gurney AL, Marsters SA, Huang RM, Pitti RM, Mark DT, Baldwin DT, Gray AM, Dowd P,
Brush J, Heldens S, Schow P, Goddard AD, Wood WI, Baker KP, Godowski PJ and
Ashkenazi A. (1999) Identification of a new member of the tumor necrosis factor family and its 
receptor, a human ortholog of mouse GITR. Current Biology 9(4): 215-218. 

Ridgway JBB, Ng E, Kern JA, Lee J, Brush J, Goddard A and Carter P. (1999) Identification
of a human anti-CD55 single-chain Fv by subtractive panning of a phage library using tumor 
and nontumor cell lines. Cancer Research 59: 2718-2723. 

Pitti RM, Marsters SA, Lawrence DA, Roy M, Kischkel FC, Dowd P, Huang A, Donahue CJ,
Sherwood SW, Baldwin DT, Godowski PJ, Wood WI, Gurney AL, Hillan KJ, Cohen RL,
Goddard AD, Botstein D and Ashkenazi A. (1998) Genomic amplification of a decoy receptor
for Fas ligand in lung and colon cancer. Nature 396(6712): 699-703. 

Pennica D, Swanson TA, Welsh JW, Roy MA, Lawrence DA, Lee J, Brush J, Taneyhill LA,
Deuel B, Lew M, Watanabe C, Cohen RL, Melhem MF, Finley GG, Quirke P, Goddard AD,
Hillan KJ, Gurney AL, Botstein D and Levine AJ. (1998) WISP genes are members of the
connective tissue growth factor family that are up-regulated in wnt-1-transformed cells and
aberrantly expressed in human colon tumors. Proc. Natl. Acad. Sci. USA 95(25): 14717-
14722.

Yang RB, Mark MR, Gray A, Huang A, Xie MH, Zhang M, Goddard A, Wood WI, Gurney AL
and Godowski PJ. (1998) Toll-like receptor-2 mediates lipopolysaccharide-induced cellular 
signalling. Nature 395(6699): 284-288. 

Merchant AM, Zhu Z, Yuan JQ, Goddard A, Adams CW, Presta LG and Carter P. (1998) An
efficient route to human bispecific IgG. Nature Biotechnology 16(7): 677-681. 

Marsters SA, Sheridan JP, Pitti RM, Brush J, Goddard A and Ashkenazi A. (1998)
Identification of a ligand for the death-domain-containing receptor Apo3. Current Biology 8(9): 
525-528. 

Xie J, Murone M, Luoh SM, Ryan A, Gu Q, Zhang C, Bonifas JM, Lam CW, Hynes M,
Goddard A, Rosenthal A, Epstein EH Jr. and de Sauvage FJ. (1998) Activating Smoothened 
mutations in sporadic basal-cell carcinoma. Nature. 391(6662): 90-92. 

Marsters SA, Sheridan JP, Pitti RM, Huang A, Skubatch M, Baldwin D, Yuan J, Gurney A, 
Goddard AD, Godowski P and Ashkenazi A. (1997) A novel receptor for Apo2L/TRAIL 
contains a truncated death domain. Current Biology 7(12): 1003-1006.

Hynes M, Stone DM, Dowd M, Pitts-Meek S, Goddard A, Gurney A and Rosenthal A. (1997)
Control of cell pattern in the neural tube by the zinc finger transcription factor Gli-1. Neuron 
19:15-26. 

Sheridan JP, Marsters SA, Pitti RM, Gurney A, Skubatch M, Baldwin D, Ramakrishnan L,
Gray CL, Baker K, Wood WI, Goddard AD, Godowski P, and Ashkenazi A. (1997) Control of
TRAIL-Induced Apoptosis by a Family of Signaling and Decoy Receptors. Science 277 
(5327): 818-821.

Goddard AD, Dowd P, Chernausek S, Geffner M, Gertner J, Hintz R, Hopwood N, Kaplan S,
Plotnick L, Rogol A, Rosenfield R, Saenger P, Mauras N, Hershkopf R, Angulo M and Attie K.
(1997) Partial growth hormone insensitivity: The role of growth hormone receptor mutations in
idiopathic short stature. J. Pediatr. 131: S51-55.

Klein RD, Sherman D, Ho WH, Stone D, Bennett GL, Moffat B, Vandlen R, Simmons L, Gu Q,
Hongo JA, Devaux B, Poulsen K, Armanini M, Nozaki C, Asai N, Goddard A, Phillips H,
Henderson CE, Takahashi M and Rosenthal A. (1997) A GPI-linked protein that interacts with
Ret to form a candidate neurturin receptor. Nature 387(6634): 717-21.

Stone DM, Hynes M, Armanini M, Swanson TA, Gu Q, Johnson RL, Scott MP, Pennica D,
Goddard A, Phillips H, Noll M, Hooper JE, de Sauvage F and Rosenthal A. (1996) The
tumour-suppressor gene patched encodes a candidate receptor for Sonic hedgehog. Nature
384(6605): 129-34.

Marsters SA, Sheridan JP, Donahue CJ, Pitti RM, Gray CL, Goddard AD, Bauer KD and
Ashkenazi A. (1996) Apo-3, a new member of the tumor necrosis factor receptor family,
contains a death domain and activates apoptosis and NF-kappa B. Current Biology 6(12):
1669-76.

Rothe M, Xiong J, Shu HB, Williamson K, Goddard A and Goeddel DV. (1996) I-TRAF is a
novel TRAF-interacting protein that regulates TRAF-mediated signal transduction. Proc. Natl. 
Acad. Sci. USA 93: 8241-8246. 

Yang M, Luoh SM, Goddard A, Reilly D, Henzel W and Bass S. (1996) The bglX gene
located at 47.8 min on the Escherichia coli chromosome encodes a periplasmic
beta-glucosidase. Microbiology 142: 1659-65.

Goddard AD and Black DM. (1996) Familial Cancer in Molecular Endocrinology of Cancer. 
Waxman, J. Ed. Cambridge University Press. Cambridge UK, pp. 187-215.

Treanor JJS, Goodman L, de Sauvage F, Stone DM, Poulson KT, Beck CD, Gray C, Armanini
MP, Pollocks RA, Hefti F, Phillips HS, Goddard A, Moore MW, Buj-Bello A, Davis AM, Asai N,
Takahashi M, Vandlen R, Henderson CE and Rosenthal A. (1996) Characterization of a
receptor for GDNF. Nature 382: 80-83. 

Klein RD, Gu Q, Goddard A and Rosenthal A. (1996) Selection for genes encoding secreted
proteins and receptors. Proc. Natl. Acad. Sci. USA 93: 7108-7113. 

Winslow JW, Moran P, Valverde J, Shih A, Yuan JQ, Wong SC, Tsai SP, Goddard A, Henzel
WJ, Hefti F and Caras I. (1995) Cloning of AL-1, a ligand for an Eph-related tyrosine kinase
receptor involved in axon bundle formation. Neuron 14: 973-981.

Bennett BD, Zeigler FC, Gu Q, Fendly B, Goddard AD, Gillett N and Matthews W. (1995)
Molecular cloning of a ligand for the EPH-related receptor protein-tyrosine kinase Htk. Proc.
Natl. Acad. Sci. USA 92: 1866-1870.

Huang X, Yuang J, Goddard A, Foulis A, James RF, Lernmark A, Pujol-Borrell R,
Rabinovitch A, Somoza N and Stewart TA. (1995) Interferon expression in the pancreases of
patients with type I diabetes. Diabetes 44: 658-664. 

Goddard AD, Yuan JQ, Fairbairn L, Dexter M, Borrow J, Kozak C and Solomon E. (1995)
Cloning of the murine homolog of the leukemia-associated PML gene. Mammalian Genome 
6: 732-737.

Goddard AD, Covello R, Luoh SM, Clackson T, Attie KM, Gesundheit N, Rundle AC, Wells
JA, Carlsson LMS and The Growth Hormone Insensitivity Study Group. (1995) Mutations of
the growth hormone receptor in children with idiopathic short stature. N. Engl. J. Med. 333: 
1093-1098. 

Kuo SS, Moran P, Gripp J, Armanini M, Phillips HS, Goddard A and Caras IW. (1994) 
Identification and characterization of Batk, a predominantly brain-specific non-receptor protein 
tyrosine kinase related to Csk. J. Neurosci. Res. 38: 705-715. 

Mark MR, Scadden DT, Wang Z, Gu Q, Goddard A and Godowski PJ. (1994) Rse, a novel
receptor-type tyrosine kinase with homology to Axl/Ufo, is expressed at high levels in the
brain. Journal of Biological Chemistry 269: 10720-10728.

Borrow J, Shipley J, Howe K, Kiely F, Goddard A, Sheer D, Srivastava A, Antony AC,
Fioretos T, Mitelman F and Solomon E. (1994) Molecular analysis of simple variant 
translocations in acute promyelocytic leukemia. Genes Chromosomes Cancer 9: 234-243. 

Goddard AD and Solomon E. (1993) Genetics of Cancer. Adv. Hum. Genet. 21: 321-376. 

Borrow J, Goddard AD, Gibbons B, Katz F, Swirsky D, Fioretos T, Cube I, Winfield DA, 
Kingston J, Hagemeijer A, Rees JKH, Lister AT and Solomon E. (1992) Diagnosis of acute 
promyelocytic leukemia by RT-PCR: Detection of PML-RARA and RARA-PML fusion 
transcripts. Br. J. Haematol. 82: 529-540.

Goddard AD, Borrow J and Solomon E. (1992) A previously uncharacterized gene, PML, is
fused to the retinoic acid receptor alpha gene in acute promyelocytic leukemia. Leukemia 6
Suppl 3: 117S-119S.

Zhu X, Dunn JM, Goddard AD, Squire JA, Becker A, Phillips RA and Gallie BL. (1992) 
Mechanisms of loss of heterozygosity in retinoblastoma. Cytogenet. Cell. Genet. 59: 248-252. 

Foulkes W, Goddard A. and Patel K. (1991) Retinoblastoma linked with Seascale [letter]. 
British Med. J. 302: 409. 

Goddard AD, Borrow J, Freemont PS and Solomon E. (1991) Characterization of a novel zinc 
finger gene disrupted by the t(15;17) in acute promyelocytic leukemia. Science 254: 1371- 
1374. 

Solomon E, Borrow J and Goddard AD. (1991) Chromosomal aberrations in cancer. Science 
254: 1153-1160. 

Pajunen L, Jones TA, Goddard A, Sheer D, Solomon E, Pihlajaniemi T and Kivirikko KI.
(1991) Regional assignment of the human gene coding for a multifunctional peptide (P4HB)
acting as the β-subunit of prolyl-4-hydroxylase and the enzyme protein disulfide isomerase to
17q25. Cytogenet. Cell. Genet. 56: 165-168. 

Borrow J, Black DM, Goddard AD, Yagle MK, Frischauf A-M and Solomon E. (1991)
Construction and regional localization of a NotI linking library from human chromosome 17q.
Genomics 10: 477-480. 

Borrow J, Goddard AD, Sheer D and Solomon E. (1990) Molecular analysis of acute
promyelocytic leukemia breakpoint cluster region on chromosome 17. Science 249: 1577- 
1580.

Myers JC, Jones TA, Pohjolainen E-R, Kadri AS, Goddard AD, Sheer D, Solomon E and
Pihlajaniemi T. (1990) Molecular cloning of α5(IV) collagen and assignment of the gene to the
region of the X-chromosome containing the Alport Syndrome locus. Am. J. Hum.
Genet. 46: 1024-1033.

Gallie BL, Squire JA, Goddard A, Dunn JM, Canton M, Hinton D, Zhu X and Phillips RA.
(1990) Mechanisms of oncogenesis in retinoblastoma. Lab. Invest. 62: 394-408. 

Goddard AD, Phillips RA, Greger V, Passarge E, Hopping W, Gallie BL and Horsthemke B.
(1990) Use of the RB1 cDNA as a diagnostic probe in retinoblastoma families. Clinical
Genetics 37: 117-126.

Zhu XP, Dunn JM, Phillips RA, Goddard AD, Paton KE, Becker A and Gallie BL. (1989)
Germline, but not somatic, mutations of the RB1 gene preferentially involve the paternal
allele. Nature 340: 312-314. 

Gallie BL, Dunn JM, Goddard A, Becker A and Phillips RA. (1988) Identification of mutations 
in the putative retinoblastoma gene. In Molecular Biology of The Eye: Genes, Vision and
Ocular Disease. UCLA Symposia on Molecular and Cellular Biology, New Series, Volume 88.
J. Piatigorsky, T. Shinohara and P.S. Zelenka, Eds. Alan R. Liss, Inc., New York, 1988. pp.
427-436. 

Goddard AD, Balakier H, Canton M, Dunn J, Squire J, Reyes E, Becker A, Phillips RA and
Gallie BL. (1988) Infrequent genomic rearrangement and normal expression of the putative
RB1 gene in retinoblastoma tumors. Mol. Cell. Biol. 8: 2082-2088.

Squire J, Dunn J. Goddard A, Hoffman T, Musarella M, Willard HF, Becker AJ. Gallie BL and 
Phillips RA. (1986) Cloning of the esterase D gene: A polymorphic gene probe closely linked 
to the retinoblastoma locus on chromosome 13. Proc. Natl. Acad. Sci. USA 83: 6573-6577. 

Squire J, Goddard AD, Canton M, Becker A, Phillips RA and Gallie BL (1986) Tumour 
inductiori by the retinoblastoma mutation is independent of N-myc expression. Nature 322: 
555-557. 

Goddard AD, Heddle JA, Gallie BL and Phillips RA. (1985) Radiation sensitivity of fibroblasts 
of bilateral retinoblastoma patients as determined by micronucleus induction in vitro. Mutation 
Research ^52•. 31-38. 

SIMULTANEOUS AMPLIFICATION AND DETECTION OF 
SPECIFIC DNA SEQUENCES 

94606. *Corresponding author. 

We have enhanced the polymerase chain reaction (PCR) such that specific DNA sequences 
can be detected without opening the reaction tube. This enhancement requires the addition 
of ethidium bromide (EtBr) to a PCR. Since the fluorescence of EtBr increases in the 
presence of double-stranded (ds) DNA, an increase in fluorescence in such a PCR indicates 
a positive amplification, which can be easily monitored externally. In fact, amplification 
can be continuously monitored in order to follow its progress. The ability to 
simultaneously amplify specific DNA sequences and detect the product of the amplification 
both simplifies and improves PCR and may facilitate its automation and more widespread 
use in the clinic or in other situations requiring high sample throughput. 

Although the potential benefits of PCR to clinical diagnostics are well known, it is still 
not widely used in this setting, even though it is four years since thermostable DNA 
polymerases made PCR practical. Some of the reasons for its slow acceptance are high cost, 
lack of automation of pre- and post-PCR processing steps, and false positive results from 
carryover-contamination. The first two points are related in that labor is the largest 
contributor to cost at the present stage of PCR development. Most current assays require 
some form of "downstream" processing once thermocycling is done in order to determine 
whether the target DNA sequence was present and has amplified. These include DNA 
hybridization, gel electrophoresis with or without use of restriction digestion, HPLC, or 
capillary electrophoresis. These methods are labor-intense, have low throughput, and are 
difficult to automate. The third point is also closely related to downstream processing. 
The handling of the PCR product in these downstream processes increases the chances that 
amplified DNA will spread through the typing lab, resulting in a risk of "carryover" false 
positives in subsequent testing. 

These downstream processing steps would be eliminated if specific amplification and 
detection of amplified DNA took place simultaneously within an unopened reaction vessel. 
Assays in which such different processes take place without the need to separate reaction 
components have been termed "homogeneous". No truly homogeneous PCR assay has been 
demonstrated to date, although progress towards this end has been reported. Chehab, et al. 
developed a PCR product detection scheme using fluorescent primers that resulted in a 
fluorescent PCR product. Allele-specific primers, each with different fluorescent tags, 
were used to indicate the genotype of the DNA. However, the unincorporated primers must 
still be removed in a downstream process in order to visualize the result. Recently, 
Holland, et al. developed an assay in which the endogenous 5' exonuclease activity of Taq 
DNA polymerase was exploited to cleave a labeled oligonucleotide probe. The probe would 
only cleave if PCR amplification had produced its complementary sequence. In order to 
detect the cleavage products, however, a subsequent process is again needed. 

FIGURE 1. Principle of simultaneous amplification and detection of PCR product. The 
components of a PCR containing EtBr that are fluorescent are listed: EtBr itself, and EtBr 
bound to either ssDNA or dsDNA. There is a large fluorescence enhancement when EtBr is 
bound to DNA and binding is greatly enhanced when DNA is double-stranded. After sufficient 
(n) cycles of PCR, the net increase in dsDNA results in additional EtBr binding, and a net 
increase in total fluorescence. 

FIGURE 2. Gel electrophoresis of PCR amplification products of the human, nuclear gene, 
HLA DQα, made in the presence of increasing amounts of EtBr (up to 8 µg/ml). The presence 
of EtBr has no obvious effect on the yield or specificity of amplification. 

FIGURE 3. (A) Fluorescence measurements from PCRs that contain 0.5 µg/ml EtBr and that are 
specific for Y-chromosome repeat sequences. Five replicate PCRs were begun containing each 
of the DNAs specified. At each indicated cycle, one of the five replicate PCRs for each 
DNA was removed from thermocycling and its fluorescence measured. Units of fluorescence 
are arbitrary. (B) UV photography of PCR tubes (0.5 ml Eppendorf-style, polypropylene 
micro-centrifuge tubes) containing reactions, those starting from 2 ng male DNA and 
control reactions without any DNA, from (A). 

We have developed a truly homogeneous assay for PCR and PCR product detection based upon 
the greatly increased fluorescence that ethidium bromide and other DNA binding dyes 
exhibit when they are bound to dsDNA. As outlined in Figure 1, a prototypic PCR begins 
with primers that are single-stranded DNA (ssDNA), dNTPs, and DNA polymerase. An amount of 
dsDNA containing the target sequence (target DNA) is also typically present. This amount 
can vary, depending on the application, from single-cell amounts of DNA to micrograms per 
PCR. If EtBr is present, the reagents that will fluoresce, in order of increasing 
fluorescence, are free EtBr itself, and EtBr bound to the single-stranded DNA primers and 
to the double-stranded target DNA (by its intercalation between the stacked bases of the 
DNA double-helix). After the first denaturation cycle, target DNA will be largely 
single-stranded. After a PCR is completed, the most significant change is the increase in 
the amount of dsDNA (the PCR product itself) of up to several micrograms. Formerly free 
EtBr is bound to the additional dsDNA, resulting in an increase in fluorescence. There is 
also some decrease in the amount of ssDNA primer, but because the binding of EtBr to ssDNA 
is much less than to dsDNA, the effect of this change on the total fluorescence of the 
sample is small. The fluorescence increase can be measured by directing excitation 
illumination through the walls of the amplification vessel before and after, or even 
continuously during, thermocycling. 
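
The signal bookkeeping described above can be sketched as a simple sum over the three 
fluorescent species. This is only an illustration: the function name and the gain values 
are assumptions chosen to respect the stated ordering (free dye, then ssDNA-bound, then 
dsDNA-bound), not measured constants, and the sketch ignores the depletion of free dye as 
it binds. 

```python
def total_fluorescence(free_dye, ss_dna_ug, ds_dna_ug, ss_gain=1.0, ds_gain=20.0):
    """Sum the signal from free dye, ssDNA-bound dye, and dsDNA-bound dye.

    ss_gain and ds_gain are assumed signal units per microgram of DNA;
    they are hypothetical values used only to illustrate the principle.
    """
    return free_dye + ss_gain * ss_dna_ug + ds_gain * ds_dna_ug


# Before cycling: mostly primers (ssDNA) and a little target dsDNA.
before = total_fluorescence(free_dye=10.0, ss_dna_ug=0.3, ds_dna_ug=0.05)
# After cycling: some primer consumed, about a microgram of dsDNA product made.
after = total_fluorescence(free_dye=10.0, ss_dna_ug=0.2, ds_dna_ug=1.05)
print(f"fold increase: {after / before:.2f}")
```

Because the dsDNA term dominates, the small loss of primer signal barely changes the 
total, which is the point made in the paragraph above. 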

RESULTS 

PCR in the presence of EtBr. In order to assess the effect of EtBr in PCR, amplifications 
of the human HLA DQα gene were performed with the dye present at concentrations from 0.06 
to 8.0 µg/ml (a typical concentration of EtBr used in staining of nucleic acids following 
gel electrophoresis is 0.5 µg/ml). As shown in Figure 2, gel electrophoresis revealed 
little or no difference in the yield or quality of the amplification product whether EtBr 
was absent or present at any of these concentrations, indicating that EtBr does not 
inhibit PCR. 

Detection of human Y-chromosome specific sequences. Sequence-specific, fluorescence 
enhancement of EtBr as a result of PCR was demonstrated in a series of amplifications 
containing 0.5 µg/ml EtBr and primers specific to repeat DNA sequences found on the human 
Y-chromosome. These PCRs initially contained either 60 ng male, 60 ng female, 2 ng male 
human or no DNA. Five replicate PCRs were begun for each DNA. After 17, 21, 24 and 29 
cycles of thermocycling, a PCR for each DNA was removed from the thermocycler, and its 
fluorescence measured in a spectrofluorometer and plotted vs. amplification cycle number 
(Fig. 3A). The shape of this curve reflects the fact that by the time an increase in 
fluorescence can be detected, the increase in DNA is becoming linear and not exponential 
with cycle number. As shown, the fluorescence increased about three-fold over the 
background fluorescence for the PCRs containing human male DNA, but did not significantly 
increase for negative control PCRs, which contained either no DNA or human female DNA. The 
more male DNA present to begin with (60 ng versus 2 ng), the fewer cycles were needed to 
give a detectable increase in fluorescence. Gel electrophoresis on the products of these 
amplifications showed that DNA fragments of the expected size were made in the male DNA 
containing reactions and that little DNA synthesis took place in the control samples. 
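
The observation that more starting DNA needs fewer cycles to become detectable follows 
from idealized exponential amplification. The sketch below assumes perfect doubling per 
cycle and an arbitrary detection level in the same relative units as the input; both the 
function name and the detection level are hypothetical, and real reactions plateau, as 
noted above. 

```python
def cycles_to_detect(start_amount, detect_amount, efficiency=1.0, max_cycles=60):
    """Return the first cycle at which the product reaches detect_amount.

    Each cycle multiplies the amount by (1 + efficiency); efficiency=1.0 is
    ideal doubling. Returns None if the level is never reached.
    """
    amount = start_amount
    for cycle in range(1, max_cycles + 1):
        amount *= 1.0 + efficiency
        if amount >= detect_amount:
            return cycle
    return None


# Relative target amounts proportional to 60 ng and 2 ng of input DNA,
# with an assumed detection level of one million relative units.
print(cycles_to_detect(60, 1e6))  # 15
print(cycles_to_detect(2, 1e6))   # 19
print(cycles_to_detect(0, 1e6))   # None: no target, no detectable rise
```

Under these assumptions a 30-fold difference in input shifts detection by about four to 
five cycles. 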

In addition, the increase in fluorescence was visualized by simply laying the completed, 
unopened PCRs on a UV transilluminator and photographing them through a red filter. This 
is shown in Figure 3B for the reactions that began with 2 ng male DNA and those with no 
DNA. 

Detection of specific alleles of the human β-globin gene. In order to demonstrate that 
this approach has adequate specificity to allow genetic screening, a detection of the 
sickle-cell anemia mutation was performed. Figure 4 shows the fluorescence from completed 
amplifications containing EtBr (0.5 µg/ml) as detected by photography of the reaction 
tubes on a UV transilluminator. These reactions were performed using primers specific for 
either the wild-type or sickle-cell mutation of the human β-globin gene. The specificity 
for each allele is imparted by placing the sickle-mutation site at the terminal 3' 
nucleotide of one primer. By using an appropriate primer annealing temperature, primer 
extension (and thus amplification) can take place only if the 3' nucleotide of the primer 
is complementary to the β-globin allele present. 

Each pair of amplifications shown in Figure 4 consists of a reaction with either the 
wild-type allele specific (left tube) or sickle-allele specific (right tube) primers. 
Three different DNAs were typed: DNA from a homozygous, wild-type β-globin individual 
(AA); from a heterozygous sickle β-globin individual (AS); and from a homozygous sickle 
β-globin individual (SS). Each DNA (50 ng genomic DNA to start each PCR) was analyzed in 
triplicate (3 pairs of reactions each). The DNA type was reflected in the relative 
fluorescence intensities in each pair of completed amplifications. There was a significant 
increase in fluorescence only where a β-globin allele DNA matched the primer set. When 
measured on a spectrofluorometer (data not shown), this fluorescence was about three times 
that present in a PCR where both β-globin alleles were mismatched to the primer set. Gel 
electrophoresis (not shown) established that this increase in fluorescence was due to the 
synthesis of nearly a microgram of a DNA fragment of the expected size for β-globin. There 
was little synthesis of dsDNA in reactions in which the allele-specific primer was 
mismatched to both alleles. 
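
The pairwise readout described above (one tube with wild-type primers, one with sickle 
primers) amounts to a small decision rule. The following is a minimal sketch of such a 
rule; the function name and the two-fold cutoff are assumptions made for illustration, 
chosen to sit below the roughly three-fold signal reported for matched primers. 

```python
def call_genotype(wild_type_signal, sickle_signal, background, fold=2.0):
    """Call AA, AS, or SS from the fluorescence of one pair of tubes.

    A tube counts as positive when its signal is at least `fold` times the
    background (the level seen with mismatched primers). `fold` is an
    assumed cutoff, not a value taken from the text.
    """
    wt_positive = wild_type_signal >= fold * background
    s_positive = sickle_signal >= fold * background
    if wt_positive and s_positive:
        return "AS"
    if wt_positive:
        return "AA"
    if s_positive:
        return "SS"
    return "no call"


print(call_genotype(3.0, 1.0, background=1.0))  # AA
print(call_genotype(3.0, 3.0, background=1.0))  # AS
print(call_genotype(1.0, 3.0, background=1.0))  # SS
```

A pair in which neither tube rises returns "no call", which is the case where inhibition 
or a failed reaction would need to be ruled out. 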

Continuous monitoring of a PCR. Using a fiber optic device, it is possible to direct 
excitation illumination from a spectrofluorometer to a PCR undergoing thermocycling and to 
return its fluorescence to the spectrofluorometer. The fluorescence readout of such an 
arrangement, directed at an EtBr-containing amplification of Y-chromosome specific 
sequences from 25 ng of human male DNA, is shown in Figure 5. The readout from a control 
PCR with no target DNA is also shown. Thirty cycles of PCR were monitored for each. 

The fluorescence trace as a function of time clearly shows the effect of the 
thermocycling. Fluorescence intensity rises and falls inversely with temperature. The 
fluorescence intensity is minimum at the denaturation temperature (94°C) and maximum at 
the annealing/extension temperature (50°C). In the negative-control PCR, these 
fluorescence maxima and minima do not change significantly over the thirty thermocycles, 
indicating that there is little dsDNA synthesis without the appropriate target DNA, and 
there is little if any bleaching of EtBr during the continuous illumination of the sample. 

In the PCR containing male DNA, the fluorescence maxima at the annealing/extension 
temperature begin to increase at about 4000 seconds of thermocycling, and continue to 
increase with time, indicating that dsDNA is being produced at a detectable level. Note 
that the fluorescence minima at the denaturation temperature do not significantly 
increase, presumably because at this temperature there is no dsDNA for EtBr to bind. Thus 
the course of the amplification is followed by tracking the fluorescence increase at the 
annealing temperature. Analysis of the products of these two amplifications by gel 
electrophoresis showed a DNA fragment of the expected size for the male DNA containing 
sample and no detectable DNA synthesis for the control sample. 
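
Tracking the annealing-temperature maxima of a continuous trace can be sketched as 
follows. The sketch assumes the trace is sampled at a fixed number of points per 
thermocycle and that a rise is declared when a cycle's maximum exceeds the early-cycle 
baseline by an assumed factor; the function names, the sampling layout, and the factor 
are all hypothetical. 

```python
def per_cycle_maxima(trace, samples_per_cycle):
    """Split a fluorescence trace into whole cycles and return each cycle's maximum."""
    maxima = []
    for start in range(0, len(trace) - samples_per_cycle + 1, samples_per_cycle):
        maxima.append(max(trace[start:start + samples_per_cycle]))
    return maxima


def first_rising_cycle(maxima, baseline_cycles=5, rise=1.2):
    """Return the 1-based cycle whose maximum first exceeds rise x baseline.

    The baseline is the mean maximum over the first `baseline_cycles` cycles.
    Returns None if no cycle exceeds the threshold.
    """
    baseline = sum(maxima[:baseline_cycles]) / baseline_cycles
    for index, value in enumerate(maxima, start=1):
        if value > rise * baseline:
            return index
    return None


# Toy trace: two samples per cycle (denaturation minimum, annealing maximum).
trace = [1, 5, 1, 5, 1, 5, 1, 8, 1, 10]
maxima = per_cycle_maxima(trace, samples_per_cycle=2)
print(maxima)                                         # [5, 5, 5, 8, 10]
print(first_rising_cycle(maxima, baseline_cycles=3))  # 4
```

A flat list of maxima, as in the negative control, yields None. 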

FIGURE 4. UV photography of PCR tubes containing amplifications using EtBr that are 
specific to wild-type (A) or sickle (S) alleles of the human β-globin gene. The left of 
each pair of tubes contains allele-specific primers to the wild-type alleles, the right 
tube primers to the sickle allele. The photograph was taken after 30 cycles of PCR, and 
the input DNAs and the alleles they contain are indicated. Fifty ng of DNA was used to 
begin PCR. Typing was done in triplicate (3 pairs of PCRs) for each input DNA. 

FIGURE 5. Continuous, real-time monitoring of a PCR. A fiber optic was used to carry 
excitation light to a PCR in progress and also emitted light back to a fluorometer (see 
Experimental Protocol). Amplification using human male-DNA specific primers in a PCR 
starting with human male DNA (top), or in a control PCR without DNA (bottom), were 
monitored. Thirty cycles of PCR were followed for each. The temperature cycled between 
94°C (denaturation) and 50°C (annealing and extension). Note in the male DNA PCR, the 
cycle (time) dependent increase in fluorescence at annealing/extension temperature. 

DISCUSSION 

Downstream processes such as hybridization to a sequence-specific probe can enhance the 
specificity of DNA detection by PCR. The elimination of these processes means that the 
specificity of this homogeneous assay depends solely on that of PCR. In the case of 
sickle-cell disease, we have shown that PCR alone has sufficient DNA sequence specificity 
to permit genetic screening. Using appropriate amplification conditions, there is little 
non-specific production of dsDNA in the absence of the appropriate target allele. 

The specificity required to detect pathogens can be more or less than that required to do 
genetic screening, depending on the number of pathogens in the sample and the amount of 
other DNA that must be taken with the sample. A difficult target is HIV, which requires 
detection of a viral genome that can be at the level of a few copies per thousands of host 
cells. Compared with genetic screening, which is performed on cells containing at least 
one copy of the target sequence, HIV detection requires both more specificity and the 
input of more total DNA (up to microgram amounts) in order to have sufficient numbers of 
target sequences. This large amount of starting DNA in an amplification significantly 
increases the background fluorescence over which any additional fluorescence produced by 
PCR must be detected. An additional complication that occurs with targets in low 
copy-number is the formation of the "primer-dimer" artifact. This is the result of the 
extension of one primer using the other primer as a template. Although this occurs 
infrequently, once it occurs the extension product is a substrate for PCR amplification, 
and can compete with true PCR targets if those targets are rare. The primer-dimer product 
is of course dsDNA and thus is a potential source of false signal in this homogeneous 
assay. 
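
The link between DNA input and the number of cells sampled can be made concrete with a 
little arithmetic. The sketch below assumes roughly 6.6 pg of DNA per diploid human cell, 
a commonly quoted approximation that does not come from the text; the constant and the 
function name are illustrative. 

```python
# Assumed approximate DNA mass of one diploid human cell, in picograms.
PG_PER_DIPLOID_CELL = 6.6


def cell_equivalents(dna_ng):
    """Convert a mass of human genomic DNA (ng) into approximate cell equivalents."""
    return dna_ng * 1000.0 / PG_PER_DIPLOID_CELL


# 50 ng is the input used for genetic typing; 1000 ng is a microgram-scale input.
print(round(cell_equivalents(50.0)))
print(round(cell_equivalents(1000.0)))
```

Under this assumption a microgram of DNA represents on the order of 10^5 cells, which is 
why a target present at a few copies per thousands of cells calls for microgram-scale 
inputs and brings the higher background fluorescence discussed above. 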

To increase PCR specificity and reduce the effect of primer-dimer amplification, we are 
investigating a number of approaches, including the use of nested-primer amplifications 
that take place in a single tube, and the "hot-start", in which nonspecific amplification 
is reduced by raising the temperature of the reaction before DNA synthesis begins. 
Preliminary results using these approaches suggest that primer-dimer is effectively 
reduced and it is possible to detect the increase in EtBr fluorescence in a PCR instigated 
by a single HIV genome in a background of 10^5 cells. With larger numbers of cells, the 
background fluorescence contributed by genomic DNA becomes problematic. To reduce this 
background, it may be possible to use sequence-specific DNA-binding dyes that can be made 
to preferentially bind PCR product over genomic DNA by incorporating the dye-binding DNA 
sequence into the PCR product through a 5' "add-on" to the oligonucleotide primer. 

We have shown that the detection of fluorescence generated by an EtBr-containing PCR is 
straightforward, both once PCR is completed and continuously during thermocycling. The 
ease with which automation of specific DNA detection can be accomplished is the most 
promising aspect of this assay. The fluorescence analysis of completed PCRs is already 
possible with existing instrumentation in 96-well format. In this format, the fluorescence 
in each PCR can be quantitated before, after, and even at selected points during 
thermocycling by moving the rack of PCRs to a 96-microwell plate fluorescence reader. 

The instrumentation necessary to continuously monitor multiple PCRs simultaneously is also 
simple in principle. A direct extension of the apparatus used here is to have multiple 
fiberoptics transmit the excitation light and fluorescent emissions to and from multiple 
PCRs. The ability to monitor multiple PCRs continuously may allow quantitation of target 
DNA copy number. Figure 3 shows that the larger the amount of starting target DNA, the 
sooner during PCR a fluorescence increase is detected. Preliminary experiments (Higuchi 
and Dollinger, manuscript in preparation) with continuous monitoring have shown a 
sensitivity to two-fold differences in initial target DNA concentration. 
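
Why a two-fold difference in starting target is a natural unit of sensitivity can be seen 
from the idealized amplification law N = N0 (1 + E)^n, where E is the per-cycle 
efficiency. The sketch below is an illustration under that assumption only; the function 
names are hypothetical and the model ignores the plateau seen late in real reactions. 

```python
import math


def cycle_shift(fold_difference, efficiency=1.0):
    """Cycles by which detection shifts for a given fold change in starting target."""
    return math.log(fold_difference) / math.log(1.0 + efficiency)


def starting_ratio(cycle_a, cycle_b, efficiency=1.0):
    """Ratio of starting target in sample A to sample B from their detection cycles."""
    return (1.0 + efficiency) ** (cycle_b - cycle_a)


print(cycle_shift(2.0))        # 1.0: a two-fold difference is one cycle at ideal efficiency
print(starting_ratio(15, 19))  # 16.0: four cycles earlier implies 16 times more target
```

At lower efficiency the same fold difference spreads over more cycles, so resolving 
two-fold differences requires locating the rise to within about one cycle. 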

Conversely, if the number of target molecules is known (as it can be in genetic 
screening), continuous monitoring may provide a means of detecting false positive and 
false negative results. With a known number of target molecules, a true positive would 
exhibit detectable fluorescence by a predictable number of cycles of PCR. Increases in 
fluorescence detected before or after that cycle would indicate potential artifacts. False 
negative results due to, for example, inhibition of DNA polymerase, may be detected by 
including within each PCR an inefficiently amplifying marker. This marker results in a 
fluorescence increase only after a large number of cycles, many more than are necessary to 
detect a true positive. If a sample fails to have a fluorescence increase after this many 
cycles, inhibition may be suspected. Since, in this assay, conclusions are drawn based on 
the presence or absence of fluorescence signal alone, such controls may be important. In 
any event, before any test based on this principle is ready for the clinic, an assessment 
of its false positive/false negative rates will need to be obtained using a large number 
of known samples. 
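
The control logic proposed above can be written out as a small decision procedure. This 
is a sketch under stated assumptions: the function name, the labels, and the tolerance of 
two cycles are hypothetical, and the expected and marker cycles would have to be 
established empirically for a given assay. 

```python
def interpret_run(rise_cycle, expected_cycle, marker_cycle, tolerance=2):
    """Interpret the cycle at which fluorescence first rises in one tube.

    rise_cycle:     cycle of the first detected rise, or None if none was seen
    expected_cycle: cycle at which a true positive should become detectable
    marker_cycle:   later cycle at which the inefficient marker alone rises
    tolerance:      assumed allowable deviation, in cycles
    """
    if rise_cycle is None:
        return "inhibition suspected"
    if abs(rise_cycle - expected_cycle) <= tolerance:
        return "positive"
    if rise_cycle >= marker_cycle - tolerance:
        return "negative"
    return "possible artifact"


print(interpret_run(25, expected_cycle=25, marker_cycle=40))    # positive
print(interpret_run(40, expected_cycle=25, marker_cycle=40))    # negative
print(interpret_run(15, expected_cycle=25, marker_cycle=40))    # possible artifact
print(interpret_run(None, expected_cycle=25, marker_cycle=40))  # inhibition suspected
```

A rise well before the expected cycle, or between the expected and marker cycles, is 
flagged rather than called, matching the caution expressed in the paragraph above. 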

In summary, the inclusion in PCR of dyes whose fluorescence is enhanced upon binding dsDNA 
makes it possible to detect specific DNA amplification from outside the PCR tube. In the 
future, instruments based upon this principle may facilitate the more widespread use of 
PCR in applications that demand the high throughput of samples. 




EXPERIMENTAL PROTOCOL 

Human HLA-DQα gene amplifications containing EtBr. PCRs were set up in 100 µl volumes 
containing 10 mM Tris-HCl, pH 8.3; 50 mM KCl; 4 mM MgCl2; units of Taq DNA polymerase 
(Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each of human HLA-DQα gene specific 
oligonucleotide primers GH26 and GH27 and approximately 10* copies of DQα PCR product 
diluted from a previous reaction. Ethidium bromide (EtBr; Sigma) was used at the 
concentrations indicated in Figure 2. Thermocycling proceeded for 20 cycles in a model 480 
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a "step-cycle" program of 94°C for 
1 min. denaturation and 60°C for 30 sec. annealing and 72°C for 30 sec. extension. 

Y-chromosome specific PCR. PCRs (100 µl total reaction volume) containing 0.5 µg/ml EtBr 
were prepared as described for HLA-DQα, except with different primers and target DNAs. 
These PCRs contained 15 pmole each male DNA-specific primers Y1.1 and Y1.2, and either 
60 ng male, 60 ng female, 2 ng male, or no human DNA. Thermocycling was 94°C for 1 min. 
and 60°C for 1 min. using a "step-cycle" program. The number of cycles for a sample were 
as indicated in Figure 3. Fluorescence measurement is described below. 

Allele-specific, human β-globin gene PCR. Amplifications of 100 µl volume using 0.5 µg/ml 
of EtBr were prepared as described for HLA-DQα above except with different primers and 
target DNAs. These PCRs contained either primer pair HGP2/Hβ14A (wild-type globin specific 
primers) or HGP2/Hβ14S (sickle-globin specific primers) at 10 pmole each primer per PCR. 
These primers were developed by Wu et al. Three different target DNAs were used in 
separate amplifications: 50 ng each of human DNA that was homozygous for the sickle trait 
(SS), DNA that was heterozygous for the sickle trait (AS), or DNA that was homozygous for 
the w.t. globin (AA). Thermocycling was for 30 cycles at 94°C for 1 min. and 55°C for 
1 min. using a "step-cycle" program. An annealing temperature of 55°C had been shown by 
Wu et al. to provide allele-specific amplification. Completed PCRs were photographed 
through a red filter (Wratten 23A) after placing the reaction tubes atop a model TM-36 
transilluminator (UV-products, San Gabriel, CA). 

Fluorescence measurement. Fluorescence measurements were made on PCRs containing EtBr in a 
Fluorolog-2 fluorometer (SPEX, Edison, NJ). Excitation was at the 500 nm band with about 
2 nm bandwidth with a GG 435 nm cut-off filter (Melles Griot, Inc., Irvine, CA) to exclude 
second-order light. Emitted light was detected at 570 nm with a bandwidth of about 7 nm. 
An OG 530 nm cut-off filter was used to remove the excitation light. 

Continuous fluorescence monitoring of PCR. Continuous monitoring of a PCR in progress was 
accomplished using the spectrofluorometer and settings described above as well as a 
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation light to, and receive 
emitted light from, a PCR placed in a well of a model 480 thermocycler (Perkin-Elmer 
Cetus). The probe end of the fiberoptic cable was attached with "5 minute-epoxy" to the 
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube with its cap removed), 
effectively sealing it. The exposed top of the PCR tube and the end of the fiberoptic 
cable were shielded from room light and the room lights were kept dimmed during each run. 
The monitored PCR was an amplification of Y-chromosome-specific repeat sequences as 
described above, except using an annealing/extension temperature of 50°C. The reaction was 
covered with mineral oil (2 drops) to prevent evaporation. Thermocycling and fluorescence 
measurement were started simultaneously. A time-base scan with a 10 second integration time 
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Wc have enhanced the polymerase ch^ti 
teacdon (PC3R) such that specific DNA 
sequences can be detected without open- 
ing the reaction tube* This enhaiicemettt 
rMuires the addition of ethi^um feromide 
(EtBr) to a PCR. Since the fluorescence of 
ElBr increases in the presence of double* 
stranded (ds) DNA an increase in fiuorts- 
cence in such a PGR indicates a positive 
ampHfication, which can be eia&ily moni- 
tored extemaUy. In feet, amplification can 
be continuously monitored in order to 
follow its progress. The ability to simuha- 
neously ampHfy specific DNA secpcnces 
and detect the product of the amplification 
both simplifi^ and improves PGR and 
may facilitate its automation and more 
widespread use in the clinic or in oth^ 
situatTons requiriilg high sample Arough- 
put 

Although tbe potential bciKAtfi ofPCR* lo cUu- 
kal <&gnosdc5 well known'''*, it w suU net 
widely used in this setting, though « w 
four years eiuco thcnnQKt*bl« DISIA potym^.*-- 

ast** made PGR rw^ti<al. Some of the r^ons for its slow 
Hoccpuncc arc lugh cost, lank of automation of pre, and 
post-PCR prooe39ing steps, and false posibve 
carryovcKOntamination, The fim two point* arc reiacea 
in tKat Ubor is the largest contributor to cost flit the present 
stage of PGR development. Most Curreiit assa^ require 
some form of "downstream" processing once *«mocy; 
ding is done in order lo determine whether the tM|ct 
I»QA sequence was present and has ampltticd, i ncse 
include DNA hybndbx^oon^*-*, gel ekefcropbortyiii with or 
>^itbout use of rc*tT'iaion digestion^ HFlXf , or cap^ry 
rAeetiophoTesb*^ These methods »rc labornmensc, bare 
low ihrt)ughput, and arc difikrult to autoinatc. The thml 
point is aio ckiscW relattd to downstream processmg. 
The handling of the PGR product in ^J^^SS^^^, 
prooewes increases the chances that arapMed DNA^wvU . 
aprcad through the typitig lab, resulting m a iisk ot 



"carryover'' false positives in subsequent testing . 

Ihese downstream processing steps would be ehnw- 
natcd if specific amplification and detection of amphfacd 
DNA took place simwitaneomly within an unopened re- 
action vessel Assays m whkh sik^ dincrcnt processes take 
place without the need to separate rcacuon components 
have been termed ^'homogeneous". Ko truly homogc- 
tieous PCR assay has been d^onstrated to date, although 
proCTCSs towards this end has been reported. C3i^b, et 
developed a PGR product detection scheme using 
fluorescent primers that resulted in a fiuorj^ent PGR 
pTOduct Allck-^pedfic primers, each with different Buo- 
tes^ent tags, were used to indicate the genotype of ttje 
D>iA. However, the anintwrporaied pnmcrs must suu Oe 
removed in a downstream process in orc^to visuahzx the 
result Recently. HoUand, et a^^^ developed aaassar m 
i^hich the endogenous 6' exti^nudease assay oi T^ DNA 
pdtytnerafie was exploited to ckave a febek^^goQUcteo- 
lide probe, Hxe probe would only ckave if PCR anjpJifi- 
cation had produced its ooropletnentaiy sequence, in 
order to dccect the dcavage products, however, a svbse- 
qucnt proccsjs is iagah^ needed. 

We have developed a truly homogeneous assay for FCR 
and PGR prodnci detectibo based upon the gready in- 
creased fluorescence that ethidroro btonude and od^r 
DNA binding dyes exhibit when diey are bound lo^j- 
DNA^^^^. Ss outUncd in Tigure 1; a pioiotypic PGR 



/ 



fliD^A primcci 




nCVRE 1 Wnciplc of simultancoua ainpIificatKm and dciea'ton of ; 
PCR preducL The component* of a IV-R conwinh^ EiBr ihaf aTC 
flifore«^t are lisied-EiBr itself, EiBr bound ta other ssDNA or 
daDNA. Tbcr« M a large ftuorcscewx cnhanocxnciit vhcn^tJir is 
bound to DNA and iSndihflr is gi^My enhanced i^cn DNA iS 
dtiublc-HTandcd, Ato sufficient <") cjdps of PGR. the net 
mcTcaM bi dsONA rcxidts in addfriooal EtBr binding, and J» net 
incrcnse in total Auorciccnce: 
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it Gel dcctrophoresb of l*Cit aLmplification prodiictB of the 
hunfuui, tiitdcar gcnc» HLA DQto, made m the presence of 
incrcASiog amouDts of EtBr (up to S M-g/tnl). The preseace of 
EU^r tia& bO obvious efifea on cite ^Id or spcciilclty of dtnpliB- 
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nGOXE > (A) HuoTCSCence mcasurcmenu -from PCRs tbdt contain 
0.5 jJtg^m! EiBr and that ate specific for V<])^tno$Oja(x: repeat 
5eqyencc». Five rcidicatc PCRi.^cre begun contAining each of the 
DNA* 5ped&ed. At each mdieaxcd cyde. one of the five repUeatc 
PCRs for cskAi DNA -was rcniovcd from thcrmocyding and tt* 
fluoresccrncc measured, Units of fluorenccnce art ^Tbitranr* (B) 
UV pbOiOgraJ^y of PCRtub«i (0,5 nil EppcndOTf-stylc, pofypiO^ 
pylcne micro-centrifugc ^tubcs) c6nt»Hun^ reftctioiiGf those scaru 
mg £rom ^ ng male DNA MiA control rcacoona without any DHA, 
from (A). 
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begins with primers that are Mnglc-strandcd DNA <»5 
DNA), flNTFs. aod DNA polymerase: An amount of 
dsONA contaittJng the target sequence (target DNA) is 
also typically present. Thj$ sunowat can vary, depending 
on the application, from singrle-cell amounts of DNA*^ to 
micrograms per PCR^*^, If EtBr is present, the reagents 
that >vTu fluoresce, in order of incnwusing fluorescence, are 
EtBr itself, and EtBr bound to the stngk-cttanded 
DNA prinsen and to the doublc^traoded target DNA (by 
its tatercaJadon between the stacked bases of the DNA 
doi^blc-hcfix). After the first denattJtation cyde, target 
DNA will be largely $ingic-stranded. After a PGR ia 
oompletcdt die most significant change i$ the increase in 
the amount of dsDN A (the PGR produa itself) of up to 
several tmcrpgr^ms. Formerly free EtBr is bound to the 
additional dsDNA* resulting in an increase in fluprcs- 
ccnoc. There is also some decrease in the amoviQl of 
ssDNA primer, but because the bindnig of EtBr to ssDN A 
ts much less than to dsDNA, che effect of this chsoige on 
the total fluoreKcn.cc of the sample is smalL The Dtiorcs- 
cencie increase can be measured by directing cxdtation 
lUuminaLion thmugh the walls of the amplifieadop. vessel 
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bctbrc and after, or even ajniinuously durinjj, thermocy- 
ding. 

RESULTS 

PGR in the presence of EtBr, order to stsset^s the 
affect of EtBr to PGR, amplifecalious of the human Hlj< 
DQa g^ne**^ uere performed with the dye present at 
concentrations from 0,06 to 8.0 pi^ml (a ty^i^ concen- 
tration of EtBr a<ie<l in staining of nucleic aads following 
gel elcarophoresis b 0.5 ji-g^mf). As shown in Figure 2, 
eleefropHoTcsis ix^vealed little or no difference in the yield 
or quality of dbic ampiifscadon product whether EtBr was 
absent or present at any of these concentratkms, indicate 
ing diat EtBr does not inhibit PGR, 

Detection of human Y-chromosome specific sequences.
Sequence-specific fluorescence enhancement of
EtBr as a result of PCR was demonstrated in a series of
amplifications containing 0.5 µg/ml EtBr and primers
specific to repeat DNA sequences found on the human
Y-chromosome. These PCRs initially contained either
60 ng male, 60 ng female, 2 ng male human or no DNA.
Five replicate PCRs were begun for each DNA. After 9,
17, 21, 24 and 29 cycles of thermocycling, a PCR for each
DNA was removed from the thermocycler, and its fluorescence
measured in a spectrofluorometer and plotted
vs. amplification cycle number (Fig. 3A). The shape of this
curve reflects the fact that by the time an increase in
fluorescence can be detected, the increase in DNA is
becoming linear and not exponential with cycle number.
As shown, the fluorescence increased about three-fold
over the background fluorescence for the PCRs containing
human male DNA, but did not significantly increase
for negative control PCRs, which contained either no
DNA or human female DNA. The more male DNA
present to begin with (60 ng versus 2 ng), the fewer
cycles were needed to give a detectable increase in fluorescence.
Gel electrophoresis on the products of these
amplifications showed that DNA fragments of the expected
size were made in the male DNA containing
reactions and that little DNA synthesis took place in the
control samples.

In addition, the increase in fluorescence was visualized
by simply laying the completed, unopened PCRs on a UV
transilluminator and photographing them through a red
filter. This is shown in Figure 3B for the reactions that
began with 2 ng male DNA and those with no DNA.

Detection of specific alleles of the human β-globin
gene. In order to demonstrate that this approach has
adequate specificity to allow genetic screening, a detection
of the sickle-cell anemia mutation was performed. Figure
4 shows the fluorescence from completed amplifications
containing EtBr (0.5 µg/ml) as detected by photography
of the reaction tubes on a UV transilluminator. These
reactions were performed using primers specific for either
the wild-type or sickle-cell mutation of the human
β-globin gene. The specificity for each allele is imparted
by placing the sickle-mutation site at the terminal 3'
nucleotide of one primer. By using an appropriate primer
annealing temperature, primer extension, and thus
amplification, can take place only if the 3' nucleotide of the
primer is complementary to the β-globin allele present.

Each pair of amplifications shown in Figure 4 consists of
a reaction with either the wild-type allele specific (left
tube) or sickle-allele specific (right tube) primers. Three
different DNAs were typed: DNA from a homozygous,
wild-type β-globin individual (AA); from a heterozygous
sickle β-globin individual (AS); and from a homozygous
sickle β-globin individual (SS). Each DNA (50 ng genomic
DNA to start each PCR) was analyzed in triplicate (3 pairs
of reactions each). The DNA type was reflected in the
relative fluorescence intensities in each pair of completed
amplifications. There was a significant increase in fluorescence
only where a β-globin allele DNA matched the
primer set. When measured on a spectrofluorometer
(data not shown), this fluorescence was about three times
that present in a PCR where both β-globin alleles were
mismatched to the primer set. Gel electrophoresis (not
shown) established that this increase in fluorescence was
due to the synthesis of nearly a microgram of a DNA
fragment of the expected size for β-globin. There was
little synthesis of dsDNA in reactions in which the
allele-specific primer was mismatched to both alleles.

Continuous monitoring of a PCR. Using a fiber optic
device, it is possible to direct excitation illumination from
a spectrofluorometer to a PCR undergoing thermocycling
and to return its fluorescence to the spectrofluorometer.
The fluorescence readout of such an arrangement, directed
at an EtBr-containing amplification of Y-chromosome
specific sequences from 25 ng of human male DNA,
is shown in Figure 5. The readout from a control PCR
with no target DNA is also shown. Thirty cycles of PCR
were monitored for each.

The fluorescence trace as a function of time clearly
shows the effect of the thermocycling. Fluorescence intensity
rises and falls inversely with temperature. The
fluorescence intensity is minimum at the denaturation
temperature (94°C) and maximum at the annealing/extension
temperature (50°C). In the negative-control PCR, these
fluorescence maxima and minima do not change significantly
over the thirty thermocycles, indicating that there is
little dsDNA synthesis without the appropriate target
DNA, and there is little if any bleaching of EtBr during
the continuous illumination of the sample.

In the PCR containing male DNA, the fluorescence
maxima at the annealing/extension temperature begin to
increase at about 4000 seconds of thermocycling, and
continue to increase with time, indicating that dsDNA is
being produced at a detectable level. Note that the
fluorescence minima at the denaturation temperature do not
significantly increase, presumably because at this temperature
there is no dsDNA for EtBr to bind. Thus the course
of the amplification is followed by tracking the fluorescence
increase at the annealing temperature. Analysis of
the products of these two amplifications by gel electrophoresis
showed a DNA fragment of the expected size for the
male DNA containing sample and no detectable DNA
synthesis for the control sample.
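
The tracking of fluorescence maxima described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the data system used in the study: the function names, the fixed number of samples per cycle, the baseline rule, and the synthetic readings are all hypothetical.

```python
def annealing_maxima(readings, samples_per_cycle):
    """Return the highest fluorescence reading within each thermocycle.

    readings: fluorescence values sampled at a constant interval.
    samples_per_cycle: assumed (hypothetical) constant number of samples per cycle.
    """
    maxima = []
    for start in range(0, len(readings) - samples_per_cycle + 1, samples_per_cycle):
        cycle = readings[start:start + samples_per_cycle]
        maxima.append(max(cycle))
    return maxima


def first_rising_cycle(maxima, baseline_cycles, fold):
    """Return the 1-based cycle whose maximum first exceeds fold x the
    mean of the first baseline_cycles maxima, or None if none does."""
    baseline = sum(maxima[:baseline_cycles]) / baseline_cycles
    for cycle_number, value in enumerate(maxima, start=1):
        if value > fold * baseline:
            return cycle_number
    return None


# Synthetic trace (made-up numbers): four samples per cycle, low at the
# denaturation temperature and high at the annealing/extension temperature.
trace = [1.0, 2.0, 2.0, 1.0,
         1.0, 2.0, 2.1, 1.0,
         1.0, 2.6, 2.8, 1.0,
         1.0, 3.5, 3.9, 1.0]
peaks = annealing_maxima(trace, samples_per_cycle=4)
rising = first_rising_cycle(peaks, baseline_cycles=2, fold=1.2)
```

Taking the per-cycle maximum mirrors the observation that only the readings at the annealing/extension temperature carry the dsDNA signal; the readings at the denaturation temperature are ignored.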

DISCUSSION

Downstream processes such as hybridization to a
sequence-specific probe can enhance the specificity of DNA
detection by PCR. The elimination of these processes
means that the specificity of this homogeneous assay
depends solely on that of PCR. In the case of sickle-cell
disease, we have shown that PCR alone has sufficient DNA
sequence specificity to permit genetic screening. Using
appropriate amplification conditions, there is little
non-specific production of dsDNA in the absence of the
appropriate target allele.

The specificity required to detect pathogens can be
more or less than that required to do genetic screening,
depending on the number of pathogens in the sample and
the amount of other DNA that must be taken with the
sample. A difficult target is HIV, which requires detection
of a viral genome that can be at the level of a few copies
per thousands of host cells. Compared with genetic
screening, which is performed on cells containing at least
one copy of the target sequence, HIV detection requires
both more specificity and the input of more total
DNA (up to microgram amounts) in order to have sufficient
numbers of target sequences. This large amount of
starting DNA in an amplification significantly increases
the background fluorescence over which any additional
fluorescence produced by PCR must be detected. An
additional complication that occurs with targets in low
copy-number is the formation of the "primer-dimer"
artifact. This is the result of the extension of one primer
using the other primer as a template. Although this occurs
infrequently, once it occurs the extension product is a
substrate for PCR amplification, and can compete with
true PCR targets if those targets are rare. The primer-dimer
product is of course dsDNA and thus is a potential
source of false signal in this homogeneous assay.

FIGURE 4 UV photography of PCR tubes containing amplifications
using EtBr that are specific to wild-type (A) or sickle (S) alleles of
the human β-globin gene. The left of each pair of tubes contains
allele-specific primers to the wild-type alleles, the right tube
primers to the sickle allele. The photograph was taken after 30
cycles of PCR, and the input DNAs and the alleles they contain
are indicated. Fifty ng of DNA was used to begin PCR. Typing
was done in triplicate (3 pairs of PCRs) for each input DNA.

FIGURE 5 Continuous, real-time monitoring of a PCR. A fiber optic
was used to carry excitation light to a PCR in progress.
Amplification using human male-DNA specific primers in a PCR
starting with human male DNA (top), or in a control
PCR without DNA (bottom), were monitored. Thirty cycles of
PCR were followed for each. The temperature cycled between
94°C (denaturation) and 50°C (annealing and extension). Note in
the male DNA PCR the cycle (time) dependent increase in
fluorescence at the annealing/extension temperature.

To increase PCR specificity and reduce the effect of
primer-dimer amplification, we are investigating a number
of approaches, including the use of nested-primer
amplifications that take place in a single tube, and the
"hot-start", in which nonspecific amplification is reduced
by raising the temperature of the reaction before DNA
synthesis begins. Preliminary results using these approaches
suggest that primer-dimer is effectively reduced
and it is possible to detect the increase in EtBr fluorescence
in a PCR instigated by a single HIV genome in a
background of 10* cells. With larger numbers of cells, the
background fluorescence contributed by genomic DNA
becomes problematic. To reduce this background, it may
be possible to use sequence-specific DNA-binding dyes
that can be made to preferentially bind PCR product over
genomic DNA by incorporating the dye-binding DNA
sequence into the PCR product through a 5' "add-on" to
the oligonucleotide primer.

We have shown that the detection of fluorescence
generated by an EtBr-containing PCR is straightforward,
both once PCR is completed and continuously during
thermocycling. The ease with which automation of specific
DNA detection can be accomplished is the most
promising aspect of this assay. The fluorescence analysis
of completed PCRs is already possible with existing
instrumentation in 96-well format. In this format, the
fluorescence in each PCR can be quantitated before, after, and
even at selected points during thermocycling by moving
the rack of PCRs to a 96-microwell plate fluorescence
reader.

The instrumentation necessary to continuously monitor
multiple PCRs simultaneously is also simple in principle.
A direct extension of the apparatus used here is to have
multiple fiberoptics transmit the excitation light and
fluorescent emissions to and from multiple PCRs. The ability
to monitor multiple PCRs continuously may allow
quantitation of target DNA copy number. Figure 3 shows that
the larger the amount of starting target DNA, the sooner
during PCR a fluorescence increase is detected. Preliminary
experiments (Higuchi and Dollinger, manuscript in
preparation) with continuous monitoring have shown a
sensitivity to two-fold differences in initial target DNA
concentration.
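
Why a two-fold difference in starting target should be resolvable can be illustrated with a short calculation. The sketch below assumes idealized exponential amplification, which the text itself notes no longer holds once fluorescence becomes detectable, so it is an approximation only. The function name, the `efficiency` parameter, and the copy numbers are hypothetical and do not come from the study.

```python
import math


def cycles_to_threshold(initial_copies, threshold_copies, efficiency=1.0):
    """Cycles needed for initial_copies to grow to threshold_copies.

    Assumes each cycle multiplies the product by (1 + efficiency);
    efficiency=1.0 is perfect doubling (an idealization).
    """
    if initial_copies <= 0 or threshold_copies <= 0:
        raise ValueError("copy numbers must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return math.log(threshold_copies / initial_copies) / math.log(1 + efficiency)


# Hypothetical numbers: the same detection threshold reached from two
# starting amounts that differ two-fold.
shift = cycles_to_threshold(1000, 1e12) - cycles_to_threshold(2000, 1e12)
```

Under perfect doubling, halving the starting amount delays the threshold by one cycle, so distinguishing two-fold differences amounts to resolving a one-cycle shift in the fluorescence rise.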

Conversely, if the number of target molecules is
known, as it can be in genetic screening, continuous
monitoring may provide a means of detecting false positive
and false negative results. With a known number of
target molecules, a true positive would exhibit detectable
fluorescence by a predictable number of cycles of PCR.
Increases in fluorescence detected before or after that
cycle would indicate potential artifacts. False negative
results due to, for example, inhibition of DNA polymerase,
may be detected by including within each PCR an
inefficiently amplifying marker. This marker results in a
fluorescence increase only after a large number of cycles,
many more than are necessary to detect a true
positive. If a sample fails to have a fluorescence increase
after this many cycles, inhibition may be suspected. Since,
in this assay, conclusions are drawn based on the presence
or absence of fluorescence signal alone, such controls may
be important. In any event, before any test based on this
principle is ready for the clinic, an assessment of its false
positive/false negative rates will need to be obtained using
a large number of known samples.
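
The control logic proposed in this paragraph can be written out as a small decision rule. The sketch below is one possible reading, not a validated procedure: the function name, the `tolerance` window, the cycle numbers, and the treatment of a rise at the marker cycle as a negative result with active polymerase are assumptions introduced for illustration.

```python
def interpret_run(rise_cycle, expected_cycle, marker_cycle, tolerance=2):
    """Interpret the cycle at which fluorescence first rises.

    rise_cycle: cycle of first detectable increase, or None if no increase.
    expected_cycle: cycle predicted for a true positive (known target number).
    marker_cycle: cycle at which the inefficiently amplifying marker rises.
    tolerance: hypothetical allowed deviation, in cycles.
    """
    if rise_cycle is None:
        # Not even the marker amplified.
        return "inhibition suspected"
    if abs(rise_cycle - expected_cycle) <= tolerance:
        return "positive"
    if abs(rise_cycle - marker_cycle) <= tolerance:
        # Only the marker amplified (assumed interpretation).
        return "negative"
    return "potential artifact"


# Hypothetical example values.
results = [
    interpret_run(None, expected_cycle=20, marker_cycle=35),
    interpret_run(21, expected_cycle=20, marker_cycle=35),
    interpret_run(35, expected_cycle=20, marker_cycle=35),
    interpret_run(12, expected_cycle=20, marker_cycle=35),
]
```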

In summary, the inclusion in PCR of dyes whose fluorescence
is enhanced upon binding dsDNA makes it
possible to detect specific DNA amplification from outside
the PCR tube. In the future, instruments based upon this
principle may facilitate the more widespread use of PCR
in applications that demand the high throughput of
samples.

EXPERIMENTAL PROTOCOL

Human HLA-DQα gene amplifications containing EtBr.
PCRs were set up in 100 µl volumes containing 10 mM Tris-HCl,
pH 8.3; 50 mM KCl; 4 mM MgCl2; 2.5 units of Taq DNA
polymerase (Perkin-Elmer Cetus, Norwalk, CT); 20 pmole each
of human HLA-DQα gene specific oligonucleotide primers
GH26 and GH27 and approximately W copies of DQα PCR
product diluted from a previous reaction. Ethidium bromide
(EtBr; Sigma) was used at the concentrations indicated in Figure
2. Thermocycling proceeded for 20 cycles in a model 480
thermocycler (Perkin-Elmer Cetus, Norwalk, CT) using a
"step-cycle" program of 94°C for 1 min. denaturation and WrG WW
sec. annealing and 72°C for 30 sec. extension.

Y-chromosome specific PCR. PCRs (100 µl total reaction
volume) containing EtBr were prepared as described
for HLA-DQα, except with different primers and target DNAs.
These PCRs contained 15 pmole each male DNA-specific primers
Y1.1 and Y1.2, and either 60 ng male, 60 ng female, 2 ng male,
or no human DNA. Thermocycling was 94°C for 1 min. and 60°C
for 1 min. using a "step-cycle" program. The number of cycles for
a sample were as indicated in Figure 3. Fluorescence measurement
is described below.

Allele-specific, human β-globin gene PCR. Amplifications of
100 µl volume using 0.5 µg/ml of EtBr were prepared as
described for HLA-DQα above except with different primers and
target DNAs. These PCRs contained either primer pair HGP2/
Hβ14A (wild-type globin specific primers) or HGP2/Hβ14S
(sickle-globin specific primers) at 10 pmole each primer per PCR.
These primers were developed by Wu et al. Three different
target DNAs were used in separate amplifications: 50 ng each of
human DNA that was homozygous for the sickle trait (SS), DNA
that was heterozygous for the sickle trait (AS), or DNA that was
homozygous for the w.t. globin (AA). Thermocycling was for 30
cycles at 94°C for 1 min. and 55°C for 1 min. using a "step-cycle"
program. An annealing temperature of 55°C had been shown in
Wu et al. to provide allele-specific amplification. Completed
PCRs were photographed through a red filter (Wratten 23A)
after placing the reaction tubes atop a model TM-36
transilluminator

Fluorescence measurement. Fluorescence measurements were
made on PCRs containing EtBr in a Fluorolog-2 fluorometer
(SPEX, Edison, NJ). Excitation was at the 500 nm band with
about 2 nm bandwidth with a GG 435 nm cut-off filter (Melles
Griot, Inc., Irvine, CA) to exclude second-order light. Emitted
light was detected at 5^0 nm with a bandwidth of about 7 nm. An
OG 530 nm cut-off filter was used to remove the excitation light.

Continuous fluorescence monitoring of PCR. Continuous
monitoring of a PCR in progress was accomplished using the
spectrofluorometer and settings described above as well as a
fiberoptic accessory (SPEX cat. no. 1950) to both send excitation
light to, and receive emitted light from, a PCR placed in a well of
a model 480 thermocycler (Perkin-Elmer Cetus). The probe end
of the fiberoptic cable was attached with "5 minute-epoxy" to the
open top of a PCR tube (a 0.5 ml polypropylene centrifuge tube
with its cap removed) effectively sealing it. The exposed top of
the PCR tube and the end of the fiberoptic cable were shielded
from room light and the room lights were kept dimmed during
each run. The monitored PCR was an amplification of
Y-chromosome-specific repeat sequences as described above, except
using an annealing/extension temperature of 50°C. The reaction
was covered with mineral oil (2 drops) to prevent evaporation.
Thermocycling and fluorescence measurement were started
simultaneously. A time-base scan with a 10 second integration time
was used and the emission signal was ratioed to the excitation
signal to control for changes in light-source intensity. Data were
collected using the dm3000f, version S.5 (SPEX) data system.
Oligonucleotides with Fluorescent Dyes at
Opposite Ends Provide a Quenched Probe
System Useful for Detecting PCR Product
and Nucleic Acid Hybridization

Kenneth J. Livak, Susan J.A. Flood, Jeffrey Marmaro, William Giusti, and Karin Deetz

Perkin-Elmer, Applied Biosystems Division, Foster City, California 94404



The 5' nuclease PCR assay detects the
accumulation of specific PCR product
by hybridization and cleavage of a
double-labeled fluorogenic probe
during the amplification reaction.
The probe is an oligonucleotide with
both a reporter fluorescent dye and a
quencher dye attached. An increase
in reporter fluorescence intensity indicates
that the probe has hybridized
to the target PCR product and has
been cleaved by the 5' → 3' nucleolytic
activity of Taq DNA polymerase.
In this study, probes with the
quencher dye attached to an internal
nucleotide were compared with
probes with the quencher dye attached
to the 3'-end nucleotide. In all
cases, the reporter dye was attached
to the 5' end. All intact probes
showed quenching of the reporter
fluorescence. In general, probes with
the quencher dye attached to the
3'-end nucleotide exhibited a larger signal
in the 5' nuclease PCR assay than
the internally labeled probes. It is
proposed that the larger signal is
caused by increased likelihood of
cleavage by Taq DNA polymerase
when the probe is hybridized to a
template strand during PCR. Probes
with the quencher dye attached to
the 3'-end nucleotide also exhibited
an increase in reporter fluorescence
intensity when hybridized to a
complementary strand. Thus, oligonucleotides
with reporter and quencher
dyes attached at opposite ends can
be used as homogeneous hybridization
probes.



A homogeneous assay for detecting
the accumulation of specific PCR product
that uses a double-labeled fluorogenic
probe was described by Lee et al.
The assay exploits the 5' → 3' nucleolytic
activity of Taq DNA polymerase
and is diagramed in Figure 1.
The fluorogenic probe consists of an
oligonucleotide with a reporter fluorescent
dye, such as a fluorescein, attached to
the 5' end; and a quencher dye, such as a
rhodamine, attached internally. When
the fluorescein is excited by irradiation,
its fluorescent emission will be
quenched if the rhodamine is close
enough to be excited through the process
of fluorescence energy transfer
(FET). During PCR, if the probe is
hybridized to a template strand, Taq DNA
polymerase will cleave the probe because
of its inherent 5' → 3' nucleolytic
activity. If the cleavage occurs between
the fluorescein and rhodamine dyes, it
causes an increase in fluorescein fluorescence
intensity because the fluorescein
is no longer quenched. The increase in
fluorescein fluorescence intensity indicates
that the probe-specific PCR product
has been generated. Thus, FET between a
reporter dye and a quencher dye is critical
to the performance of the probe in
the 5' nuclease PCR assay.

Quenching is completely dependent
on the physical proximity of the two
dyes. Because of this, it has been assumed
that the quencher dye must be
attached near the 5' end. Surprisingly,
we have found that attaching a rhodamine
dye at the 3' end of a probe
still provides adequate quenching for
the probe to perform in the 5' nuclease
PCR assay. Furthermore, cleavage of this
type of probe is not required to achieve
some reduction in quenching. Oligonucleotides
with a reporter dye on the 5'
end and a quencher dye on the 3' end
exhibit a much higher reporter fluorescence
when double-stranded as compared
with single-stranded. This should
make it possible to use this type of
double-labeled probe for homogeneous
detection of nucleic acid hybridization.

MATERIALS AND METHODS
Oligonucleotides

Table 1 shows the nucleotide sequence
of the oligonucleotides used in this
study. Linker arm nucleotide (LAN)
phosphoramidite was obtained from
Glen Research. The standard DNA
phosphoramidites, 6-carboxyfluorescein
(6-FAM) phosphoramidite,
6-carboxytetramethylrhodamine succinimidyl ester
(TAMRA NHS ester), and Phosphalink
for attaching a 3'-blocking phosphate,
were obtained from Perkin-Elmer, Applied
Biosystems Division. Oligonucleotide
synthesis was performed using an
ABI model 394 DNA synthesizer (Applied
Biosystems). Primer and complement
oligonucleotides were purified using
Oligo Purification Cartridges (Applied
Biosystems). Double-labeled probes were
synthesized with 6-FAM-labeled
phosphoramidite at the 5' end, LAN replacing
one of the T's in the sequence, and
Phosphalink at the 3' end. Following
deprotection and ethanol precipitation,
TAMRA NHS ester was coupled to the
LAN-containing oligonucleotide in 250
mM Na-bicarbonate buffer (pH 9.0) at
room temperature. Unreacted dye was
removed by passage over a PD-10 Sephadex
column. Finally, the double-labeled
probe was purified by preparative
high-performance liquid chromatography
(HPLC) using an Aquapore C8 220x4.6-mm
column with 7-µm particle size. The
column was developed with a 24-min
linear gradient of 8-20% acetonitrile in
0.1 M TEAA (triethylamine acetate).
Probes are named by designating the
sequence from Table 1 and the position of
the LAN-TAMRA moiety. For example,
probe A1-7 has sequence A1 with
LAN-TAMRA at nucleotide position 7 from the
5' end.

FIGURE 1 Diagram of 5' nuclease assay.



PCR Systems

All PCR amplifications were performed
in the Perkin-Elmer GeneAmp PCR System
9600 using 50-µl reactions that contained
10 mM Tris-HCl (pH 8.3), 50 mM
KCl, 200 µM dATP, 200 µM dGTP, 200 µM
dCTP, 400 µM dUTP, 0.5 unit of AmpErase
uracil N-glycosylase (Perkin-Elmer),
and 1.25 unit of AmpliTaq DNA polymerase
(Perkin-Elmer). A 295-bp segment
from exon 3 of the human β-actin
gene (nucleotides 2141-2435 in the
sequence of Nakajima-Iijima et al.) was
amplified using primers AFP and ARP
(Table 1), which are modified slightly
from those of du Breuil et al. Actin
amplification reactions contained 4 mM
MgCl2, 20 ng of human genomic DNA,
50 nM A1 or A3 probe, and 300 nM each
primer. The thermal regimen was 50°C
(2 min), 95°C (10 min), 40 cycles of 95°C
(20 sec), 60°C (1 min), and hold at 72°C.
A 515-bp segment was amplified from a
plasmid that consists of a segment of λ
DNA (nucleotides 32,220-32,747) inserted
in the SmaI site of vector pUC119.
These reactions contained 3.5 mM
MgCl2, 1 ng of plasmid DNA, 50 nM P2 or
P5 probe, 200 nM primer F119, and 200
nM primer R119. The thermal regimen
was 50°C (2 min), 95°C (10 min), 25 cycles
of 95°C (20 sec), srG (1 min), and
hold at 72°C.
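
For bench use, the final concentrations quoted above can be converted to absolute amounts per 50-µl reaction. The helper below is a minimal sketch of that arithmetic; the function name is hypothetical and the conversion relies only on the identity 1 nM x 1 µl = 1 fmol.

```python
def picomoles(concentration_nM, volume_ul):
    """Amount of solute in pmol for a given final concentration and volume.

    1 nM x 1 µl = 1e-9 mol/l x 1e-6 l = 1e-15 mol = 1 fmol,
    so the product is divided by 1000 to give pmol.
    """
    if concentration_nM < 0 or volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_nM * volume_ul / 1000.0


# Actin reactions as described above: 300 nM each primer and
# 50 nM probe in a 50-µl reaction.
primer_pmol = picomoles(300, 50)
probe_pmol = picomoles(50, 50)
```

With these values each actin reaction holds 15 pmol of each primer and 2.5 pmol of probe.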



Fluorescence Detection

For each amplification reaction, a 40-µl
aliquot of a sample was transferred to an
individual well of a white, 96-well microtiter
plate (Perkin-Elmer). Fluorescence
was measured on the Perkin-Elmer TaqMan
LS-50B System, which consists of a
luminescence spectrometer with plate
reader assembly, a 485-nm excitation filter,
and a 515-nm emission filter. Excitation
was at 488 nm using a 5-nm slit
width. Emission was measured at 518
nm for 6-FAM (the reporter or R value)
and 582 nm for TAMRA (the quencher or
Q value) using a 10-nm slit width. To
determine the increase in reporter emission
that is caused by cleavage of the
probe during PCR, three normalizations
are applied to the raw emission data.
First, emission intensity of a buffer blank
is subtracted for each wavelength. Second,
emission intensity of the reporter is


TABLE 1 Sequences of Oligonucleotides

Name   Type         Sequence

F119   primer       ACCCACAGGAACTGATCACCACTC
R119   primer       ATGTCGCGTTGCGGCTGAGGTTCTGC
P2     probe        TCGCATTACTGATGGTTGCCAAGCAGTp
       complement   GTACTGGTTGGCAACGATCAGTAATGCGATG
P5     probe        CGGATTTGCTGGTATCTATGAGAAGGATp
       complement   TTCATGCTTGTCATAGATACCAGGAAATCCG
AFP    primer       TCACCCACACTGTGCCCATCTACGA
ARP    primer       CAGCGGAACCGCTCATTGCCAATGG
A1     probe        ATGCCCTCCCCCATGCCATCCTGCGTp
       complement   AGACGCAGGATGGCATGGGGGAGGGCATAC
A3     probe        CGCCCTGGACTTCGAGCAAGAGATp
       complement   CCATCTCTTGCTCGAAGTCCAGGGCGAC

For each oligonucleotide used in this study, the nucleic acid sequence is given, written in the
5' → 3' direction. There are three types of oligonucleotides: PCR primer, probe used
in the 5' nuclease assay, and complement used to hybridize to the corresponding probe. For the
probes, the underlined base indicates a position where LAN with TAMRA attached was
substituted for a T. (p) The presence of a 3' phosphate on each probe.

358 PCR Methods and Applications




A1-2    RAQGCCCTCCCCCATGCCATCCTGCGTp
A1-7    RATGCCCQCCCCCATGCCATCCTGCGTp
A1-14   RATGCCCTCCCCCAQGCCATCCTGCGTp
A1-19   RATGCCCTCCCCCATGCCAQCCTGCGTp
A1-22   RATGCCCTCCCCCATGCCATCCQGCGTp
A1-26   RATGCCCTCCCCCATGCCATCCTGCGQp

         518 nm                        582 nm
Probe    no temp.       + temp.        no temp.      + temp.       RQ-           RQ+           ΔRQ

A1-2     25.5 ± 2.1     32.7 ± 1.9     38.2 ± 3.0    38.2 ± 2.0    0.67 ± 0.01   0.86 ± 0.06   0.19 ± 0.06
A1-7     53.5 ± 4.3     395.1 ± 21.4   108.5 ± 6.3   110.3 ± 5.3   0.49 ± 0.03   3.58 ± 0.17   3.09 ± 0.18
A1-14    127.0 ± 4.9    403.5 ± 19.1   109.7 ± 5.3   93.1 ± 6.3    1.16 ± 0.02   4.34 ± 0.15   3.18 ± 0.15
A1-19    187.5 ± 17.9   422.7 ± 7.7    70.3 ± 7.4    73.0 ± 2.8    2.67 ± 0.05   5.80 ± 0.15   3.13 ± 0.16
A1-22    224.6 ± 9.4    482.2 ± 43.6   100.0 ± 4.0   96.2 ± 9.6    2.25 ± 0.03   5.02 ± 0.11   2.77 ± 0.12
A1-26    160.2 ± 8.9    454.1 ± 18.4   93.1 ± 5.4    90.7 ± 3.2    1.72 ± 0.02   5.01 ± 0.08   3.29 ± 0.08

FIGURE 2 Results of 5' nuclease assay comparing β-actin probes with TAMRA at different
nucleotide positions. As described in Materials and Methods, PCR amplifications containing the
indicated probes were performed, and the fluorescence emission was measured at 518 and 582 nm.
Reported values are the average ± 1 S.D. for six reactions run without added template (no temp.)
and six reactions run with template (+ temp.). The RQ ratio was calculated for each individual
reaction and averaged to give the reported RQ- and RQ+ values.



divided by the emission intensity of the
quencher to give an RQ ratio for each
reaction tube. This normalizes for
well-to-well variations in probe concentration
and fluorescence measurement. Finally,
ΔRQ is calculated by subtracting
the RQ value of the no-template control
(RQ-) from the RQ value for the complete
reaction including template
(RQ+).
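
The three normalizations can be sketched in Python as follows. This is an illustrative sketch rather than the instrument's software: the function names are hypothetical, the blank values default to zero, and the example feeds in the mean A1-7 intensities reported in Figure 2 as if they were single, already blank-corrected readings, which is an assumption made only to check the arithmetic.

```python
def rq(reporter, quencher, reporter_blank=0.0, quencher_blank=0.0):
    """RQ ratio for one reaction: blank-corrected 518-nm reporter emission
    divided by blank-corrected 582-nm quencher emission."""
    corrected_quencher = quencher - quencher_blank
    if corrected_quencher <= 0:
        raise ValueError("blank-corrected quencher emission must be positive")
    return (reporter - reporter_blank) / corrected_quencher


def delta_rq(with_template, no_template):
    """Mean RQ+ minus mean RQ-.

    Each argument is a list of (reporter, quencher) emission pairs,
    one pair per reaction, assumed already blank-corrected.
    """
    rq_plus = sum(rq(r, q) for r, q in with_template) / len(with_template)
    rq_minus = sum(rq(r, q) for r, q in no_template) / len(no_template)
    return rq_plus - rq_minus


# Mean intensities for probe A1-7 from Figure 2, used here as single readings.
a1_7 = delta_rq(with_template=[(395.1, 110.3)], no_template=[(53.5, 108.5)])
```

Rounded to two decimals, the result agrees with the ΔRQ of 3.09 reported for A1-7. In the study itself the ratio was computed for each of six reactions and then averaged, which the list arguments accommodate.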

RESULTS 

A series of probes with increasing distances
between the fluorescein reporter
and rhodamine quencher were tested to
investigate the minimum and maximum
spacing that would give an acceptable
performance in the 5' nuclease PCR assay.
These probes hybridize to a target
sequence in the human β-actin gene.
Figure 2 shows the results of an experiment
in which these probes were included
in PCR that amplified a segment
of the β-actin gene containing the target
sequence. Performance in the 5' nuclease
PCR assay is monitored by the
magnitude of ΔRQ, which is a measure
of the increase in reporter fluorescence
caused by PCR amplification of the
probe target. Probe A1-2 has a ΔRQ value
that is close to zero, indicating that the
probe was not cleaved appreciably during
the amplification reaction. This suggests
that with the quencher dye on the
second nucleotide from the 5' end, there
is insufficient room for Taq polymerase
to cleave efficiently between the reporter
and quencher. The other five probes
exhibited comparable ΔRQ values that are
clearly different from zero. Thus, all five
probes are being cleaved during PCR
amplification resulting in a similar increase
in reporter fluorescence. It should be
noted that complete digestion of a probe
produces a much larger increase in reporter
fluorescence than that observed
in Figure 2 (data not shown). Thus, even
in reactions where amplification occurs,
the majority of probe molecules remain
uncleaved. It is mainly for this reason
that the fluorescence intensity of the
quencher dye TAMRA changes little with
amplification of the target. This is what
allows us to use the 582-nm fluorescence
reading as a normalization factor.

The magnitude of RQ- depends
mainly on the quenching efficiency inherent
in the specific structure of the
probe and the purity of the oligonucleotide.
Thus, the larger RQ- values indicate
that probes A1-14, A1-19, A1-22, and
A1-26 probably have reduced quenching
as compared with A1-7. Still, the degree
of quenching is sufficient to detect a
highly significant increase in reporter
fluorescence when each of these probes
is cleaved during PCR.

To further investigate the ability of 
TAMRA on the 3' end to quench 6-FAM 
on the 5' end, three additional pairs of 
probes were tested in the 5' nuclease 
PGR assay. For each pair, one probe has 
TAMRA attached to an internal nucle- 
otide and the other has TAMRA attached 
to the 3' end nucleotide. The results are 
shown in Table 2. For all three sets, the 
probe with the 3' quencher exhibits a 
ARQ value that is considerably higher 
than for the probe with the internal 
quencher. The RQ" values suggest that 
differences in quenching are not as great 
as those observed with some of the Al 
probes. These results demonstrate that a 
quencher dye on the 3' end of an oligo- 
nucleotide can quench effidentiy the 



TABLE 2 Results of 5' Nuclease Assay Comparing Probes with TAMRA Attached to an Internal or 3'-terminal Nucleotide

         518 nm                      582 nm
Probe    no temp.      + temp.       no temp.      + temp.       RQ⁻          RQ⁺          ΔRQ
A3-6     54.6 ± 3.2    84.8 ± 3.7    116.2 ± 6.4   115.6 ± 2.5   0.47 ± 0.02  0.73 ± 0.03  0.26 ± 0.04
A3-24    72.1 ± 2.9    236.5 ± 11.1  84.2 ± 4.0    90.2 ± 3.8    0.86 ± 0.02  2.62 ± 0.05  1.76 ± 0.05
P2-7     82.8 ± 4.4    384.0 ± 34.1  105.1 ± 6.4   120.4 ± 10.2  0.79 ± 0.02  3.19 ± 0.16  2.40 ± 0.16
P2-27    113.4 ± 6.6   555.4 ± 14.1  140.7 ± 8.5   118.7 ± 4.8   0.81 ± 0.01  4.68 ± 0.10  3.88 ± 0.10
P5-10    77.5 ± 6.5    244.4 ± 15.9  86.7 ± 4.3    95.8 ± 6.7    0.89 ± 0.05  2.55 ± 0.06  1.66 ± 0.08
P5-28    64.0 ± 5.2    333.6 ± 12.1  100.6 ± 6.1   94.7 ± 6.3    0.63 ± 0.02  3.53 ± 0.12  2.89 ± 0.13

Reactions containing the indicated probes and calculations were performed as described in Materials and Methods and in the legend to Fig. 2.

fluorescence of a reporter dye on the 5' end. The degree of quenching
is sufficient for this type of oligonucleotide to be used as a probe
in the 5' nuclease PCR assay.
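
The ratios in Table 2 can be recomputed from the raw fluorescence readings. The following is a minimal sketch, assuming that RQ is the 518-nm intensity divided by the 582-nm intensity (as stated in the legend to Fig. 3) and that ΔRQ is RQ⁺ minus RQ⁻; the function and variable names are illustrative only and do not come from the original methods.

```python
# Mean fluorescence readings copied from Table 2 (arbitrary units).
# Each entry: (518 nm no template, 518 nm + template,
#              582 nm no template, 582 nm + template)
TABLE_2 = {
    "A3-6": (54.6, 84.8, 116.2, 115.6),
    "A3-24": (72.1, 236.5, 84.2, 90.2),
    "P2-7": (82.8, 384.0, 105.1, 120.4),
    "P2-27": (113.4, 555.4, 140.7, 118.7),
    "P5-10": (77.5, 244.4, 86.7, 95.8),
    "P5-28": (64.0, 333.6, 100.6, 94.7),
}


def rq_values(reporter_minus, reporter_plus, quencher_minus, quencher_plus):
    """Return (RQ-, RQ+, delta RQ) for one probe.

    RQ is reporter (518 nm) over quencher (582 nm) emission; the
    difference RQ+ - RQ- is assumed to be the reported delta RQ.
    """
    rq_minus = reporter_minus / quencher_minus
    rq_plus = reporter_plus / quencher_plus
    return rq_minus, rq_plus, rq_plus - rq_minus


for probe, readings in TABLE_2.items():
    rq_minus, rq_plus, delta = rq_values(*readings)
    print(f"{probe:6s} RQ-={rq_minus:.2f} RQ+={rq_plus:.2f} dRQ={delta:.2f}")
```

Because the published ratios were presumably calculated from unrounded replicate readings, values recomputed from the tabulated means can differ from the table in the last digit.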

To test the hypothesis that quenching 
by a 3' TAMRA depends on the flexibility 
of the oligonucleotide, fluorescence was 
measured for probes in the single- 
stranded and double-stranded states. Ta- 
ble 3 reports the fluorescence observed 
at 518 and 582 nm. The relative degree 
of quenching is assessed by calculating 
the RQ ratio. For probes with TAMRA 
6-10 nucleotides from the 5' end, there 
is little difference in the RQ values when 
comparing single-stranded with double- 
stranded oligonucleotides. The results 
for probes with TAMRA at the 3' end are 
much different. For these probes, hy- 
bridization to a complementary strand 
causes a dramatic increase in RQ. We 
propose that this loss of quenching is 
caused by the rigid structure of double- 
stranded DNA, which prevents the 5' 
and 3' ends from being in proximity. 
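
The contrast between internally labeled and 3'-labeled probes can be expressed as the fold change in RQ on hybridization. The sketch below uses four rows of Table 3 and assumes RQ is the 518-nm reading divided by the 582-nm reading; the names are hypothetical and chosen only for illustration.

```python
# (518 nm, 582 nm) readings from Table 3 for the single-stranded (ss)
# and double-stranded (ds) states of four probes.
TABLE_3 = {
    "A1-7": {"ss": (27.75, 61.08), "ds": (68.53, 138.18)},
    "A1-26": {"ss": (43.31, 53.50), "ds": (509.38, 93.86)},
    "P2-7": {"ss": (35.02, 54.63), "ds": (70.13, 121.09)},
    "P2-27": {"ss": (39.89, 65.10), "ds": (320.47, 61.13)},
}


def rq(reading):
    """Reporter (518 nm) over quencher (582 nm) emission."""
    reporter, quencher = reading
    return reporter / quencher


def hybridization_fold_change(probe):
    """RQ of the double-stranded probe divided by RQ of the single strand."""
    states = TABLE_3[probe]
    return rq(states["ds"]) / rq(states["ss"])


for name in TABLE_3:
    print(f"{name:6s} ds/ss RQ fold change = {hybridization_fold_change(name):.1f}")
```

A fold change near 1 corresponds to quenching that is unaffected by hybridization, as seen for the internally labeled probes; a large fold change corresponds to the loss of quenching described for the 3'-labeled probes.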

When TAMRA is placed toward the 3' end, there is a marked Mg²⁺ effect
on quenching. Figure 3 shows a plot of observed RQ values for the A1
series of probes as a function of Mg²⁺ concentration. With TAMRA
attached near the 5' end (probe A1-2 or A1-7), the RQ value at 0 mM
Mg²⁺ is only slightly higher than RQ at 10 mM Mg²⁺. For probes A1-19,
A1-22, and A1-26, the RQ values at 0 mM Mg²⁺ are very high, indicating
a much reduced quenching efficiency. For each of these probes, there
is a marked decrease in RQ at 1 mM Mg²⁺ followed by a gradual decline
as the Mg²⁺ concentration increases to 10 mM. Probe A1-14 shows an
intermediate RQ value at 0 mM Mg²⁺ with a gradual decline at higher
Mg²⁺ concentrations. In a low-salt environment with no Mg²⁺ present, a
single-stranded oligonucleotide would be expected to adopt an extended
conformation because of electrostatic repulsion. The binding of Mg²⁺
ions acts to shield the negative charge of the phosphate backbone so
that the oligonucleotide can adopt conformations where the 3' end is
close to the 5' end. Therefore, the observed Mg²⁺ effects support the
notion that quenching of a 5' reporter dye by TAMRA at or near the 3'
end depends on the flexibility of the oligonucleotide.

DISCUSSION 

The striking finding of this study is that it seems the rhodamine dye
TAMRA, placed at any position in an oligonucleotide, can quench the
fluorescent emission of a fluorescein (6-FAM) placed at the 5' end.
This implies that a single-stranded, double-labeled oligonucleotide
must be able to adopt conformations where the TAMRA is close to the 5'
end. It should be noted that the decay of 6-FAM in the excited state
requires a certain amount of time. Therefore, what



TABLE 3 Comparison of Fluorescence Emissions of Single-stranded and
Double-stranded Fluorogenic Probes

         518 nm              582 nm              RQ
Probe    ss       ds         ss       ds         ss      ds
A1-7     27.75    68.53      61.08    138.18     0.45    0.50
A1-26    43.31    509.38     53.50    93.86      0.81    5.43
A3-6     16.75    62.88      39.33    165.57     0.43    0.38
A3-24    30.05    578.64     67.72    140.25     0.45    3.21
P2-7     35.02    70.13      54.63    121.09     0.64    0.58
P2-27    39.89    320.47     65.10    61.13      0.61    5.25
P5-10    27.34    144.85     61.95    165.54     0.44    0.87
P5-28    33.65    462.29     72.39    104.61     0.46    4.43

(ss) Single-stranded. The fluorescence emissions at 518 or 582 nm for solutions containing a final
concentration of 50 nM indicated probe, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, and 10 mM MgCl2.
(ds) Double-stranded. The solutions contained, in addition, 100 nM A1C for probes A1-7 and
A1-26, 100 nM A3C for probes A3-6 and A3-24, 100 nM P2C for probes P2-7 and P2-27, or 100 nM
P5C for probes P5-10 and P5-28. Before the addition of MgCl2, 120 µl of each sample was heated
at 95°C for 5 min. Following the addition of 80 µl of 25 mM MgCl2, each sample was allowed to
cool to room temperature and the fluorescence emissions were measured. Reported values are
the average of three determinations.



matters for quenching is not the average 
distance between 6-FAM and TAMRA 
but, rather, how close TAMRA can get to 
6-FAM during the lifetime of the 6-FAM 
excited state. As long as the decay time of 
the excited state is relatively long compared
with the molecular motions of the 
oligonucleotide, quenching can occur. 
Thus, we propose that TAMRA at the 3' 
end, or any other position, can quench 
6-FAM at the 5' end because TAMRA is in 
proximity to 6-FAM often enough to be 
able to accept energy transfer from an 
excited 6-FAM. 

Details of the fluorescence measurements remain puzzling. For example,
Table 3 shows that hybridization of probes A1-26, A3-24, and P5-28 to
their complementary strands not only causes a large increase in 6-FAM
fluorescence at 518 nm but also causes a modest increase in TAMRA
fluorescence at 582 nm. If TAMRA is being excited by energy transfer
from quenched 6-FAM, then loss of quenching attributable to
hybridization should cause a decrease in the fluorescence emission of
TAMRA. The fact that the fluorescence emission of TAMRA increases
indicates that the situation is more complex. For example, we have
anecdotal evidence that the bases of the oligonucleotide, especially
G, quench the fluorescence of both 6-FAM and TAMRA to some degree.
When double-stranded, base-pairing may reduce the ability of the bases
to quench. The primary factor causing the quenching of 6-FAM in an
intact probe is the TAMRA dye. Evidence for the importance of TAMRA is
that 6-FAM fluorescence remains relatively unchanged when probes
labeled only with 6-FAM are used in the 5' nuclease PCR assay (data
not shown). Secondary effectors of fluorescence, both before and after
cleavage of the probe, need to be explored further.

Regardless of the physical mechanism, the relative independence of
position and quenching greatly simplifies the design of probes for the
5' nuclease PCR assay. There are three main factors that determine the
performance of a double-labeled fluorescent probe in the 5' nuclease
PCR assay. The first factor is the degree of quenching observed in the
intact probe. This is characterized by the value of RQ⁻, which is the
ratio of reporter to quencher fluorescent emissions for a no template
control PCR. Influences on the value of RQ⁻ include the particular
reporter and quencher

the 5' end and were designed so that any mismatches were between the
reporter and quencher. Increasing the distance between reporter and
quencher would lessen the disruptive effect of mismatches and allow
cleavage of the probe on the incorrect target. Thus, probes with a
quencher attached to an internal nucleotide may still be useful for
allelic discrimination.

In this study loss of quenching upon hybridization was used to show
that quenching by a 3' TAMRA is dependent on the flexibility of a
single-stranded oligonucleotide. The increase in reporter fluorescence
intensity, though, could also be used to determine whether
hybridization has occurred or not. Thus, oligonucleotides with
reporter and quencher dyes attached at opposite ends should also be
useful as hybridization probes. The ability to detect hybridization in
real time means that these probes could be used to measure
hybridization kinetics. Also, this type of probe could be used to
develop homogeneous hybridization assays for diagnostics or other
applications. Bagwell et al. describe just this type of homogeneous
assay where hybridization of a probe causes an increase in
fluorescence caused by a loss of quenching. However, they utilized a
complex probe design that requires adding nucleotides to both ends of
the probe sequence to form two imperfect hairpins. The results
presented here demonstrate that the simple addition of a reporter dye
to one end of an oligonucleotide and a quencher dye to the other end
generates a fluorogenic probe that can detect hybridization or PCR
amplification.

FIGURE 3 Effect of Mg²⁺ concentration on RQ ratio for the A1 series of probes. The fluorescence
emission intensity at 518 and 582 nm was measured for solutions containing 50 nM probe, 10 mM
Tris-HCl (pH 8.3), 50 mM KCl, and varying amounts (0-10 mM) of MgCl2. The calculated RQ
ratios (518 nm intensity divided by 582 nm intensity) are plotted vs. MgCl2 concentration (mM
Mg). The key (upper right) shows the probes examined.



dyes used, spacing between reporter and quencher dyes, nucleotide
sequence context effects, presence of structure or other factors that
reduce flexibility of the oligonucleotide, and purity of the probe.
The second factor is the efficiency of hybridization, which depends on
probe Tm, presence of secondary structure in probe or template,
annealing temperature, and other reaction conditions. The third factor
is the efficiency at which Taq DNA polymerase cleaves the bound probe
between the reporter and quencher dyes. This cleavage is dependent on
sequence complementarity between probe and template as shown by the
observation that mismatches in the segment between reporter and
quencher dyes drastically reduce the cleavage of probe.

The rise in RQ⁻ values for the A1 series of probes seems to indicate
that the degree of quenching is reduced somewhat as the quencher is
placed toward the 3' end. The lowest apparent quenching is observed
for probe A1-19 (see Fig. 3) rather than for the probe where the TAMRA
is at the 3' end (A1-26). This is understandable, as the conformation
of the 3' end position would be expected to be less restricted than
the conformation of an internal position. In effect, a quencher at the
3' end is freer to adopt conformations close to the 5' reporter dye
than is an internally placed quencher. For the other three sets of
probes, the interpretation of RQ⁻ values is less clear-cut. The A3
probes show the same trend as A1, with the 3' TAMRA probe having a
larger RQ⁻ than the internal TAMRA probe. For the P2 pair, both probes
have about the same RQ⁻ value. For the P5 probes, the RQ⁻ for the 3'
probe is less than for the internally labeled probe. Another factor
that may explain some of the observed variation is that purity affects
the RQ⁻ value. Although all probes are HPLC purified, a small amount
of contamination with unquenched reporter can have a large effect on
RQ⁻.

Although there may be a modest effect on degree of quenching, the
position of the quencher apparently can have a large effect on the
efficiency of probe cleavage. The most drastic effect is observed with
probe A1-2, where placement of the TAMRA on the second nucleotide
reduces the efficiency of cleavage to almost zero. For the A3, P2, and
P5 probes, ΔRQ is much greater for the 3' TAMRA probes as compared
with the internal TAMRA probes. This is explained most easily by
assuming that probes with TAMRA at the 3' end are more likely to be
cleaved between reporter and quencher than are probes with TAMRA
attached internally. For the A1 probes, the cleavage efficiency of
probe A1-7 must already be quite high, as ΔRQ does not increase when
the quencher is placed closer to the 3' end. This illustrates the
importance of being able to use probes with a quencher on the 3' end
in the 5' nuclease PCR assay. In this assay, an increase in the
intensity of reporter fluorescence is observed only when the probe is
cleaved between the reporter and quencher dyes. By placing the
reporter and quencher dyes on the opposite ends of an oligonucleotide
probe, any cleavage that occurs will be detected. When the quencher is
attached to an internal nucleotide, sometimes the probe works well
(A1-7) and other times not so well (A3-6). The relatively poor
performance of probe A3-6 presumably means the probe is being cleaved
3' to the quencher rather than between the reporter and quencher.
Therefore, the best chance of having a probe that reliably detects
accumulation of PCR product in the 5' nuclease PCR assay is to use a
probe with the reporter and quencher dyes on opposite ends.

Placing the quencher dye on the 3' end may also provide a slight
benefit in terms of hybridization efficiency. The presence of a
quencher attached to an internal nucleotide might be expected to
disrupt base-pairing and reduce the Tm of a probe. In fact, a 2°C-3°C
reduction in Tm has been observed for two probes with internally
attached TAMRAs. This disruptive effect would be minimized by placing
the quencher at the 3' end. Thus, probes with 3' quenchers might
exhibit slightly higher hybridization efficiencies than probes with
internal quenchers.

The combination of increased cleavage and hybridization efficiencies
means that probes with 3' quenchers probably will be more tolerant of
mismatches between probe and target as compared with internally
labeled probes. This tolerance of mismatches can be advantageous, as
when trying to use a single probe to detect PCR-amplified products
from samples of different species. Also, it means that cleavage of
probe during PCR is less sensitive to alterations in annealing
temperature or other reaction conditions. The one application where
tolerance of mismatches may be a disadvantage is for allelic
discrimination. Lee et al. demonstrated that allele-specific probes
were cleaved between reporter and quencher only when hybridized to a
perfectly complementary target. This allowed them to distinguish the
normal human cystic fibrosis allele from the ΔF508 mutant. Their
probes had TAMRA attached to the seventh nucleotide from

We have developed a novel "real time" quantitative PCR ...
accumulation through a dual-labeled fluorogenic probe ...
PCR is extremely accurate and less labor-intensive than current quantitative PCR methods.



Quantitative nucleic acid sequence analysis has had an important role
in many fields of biological research. Measurement of gene expression
(RNA) has been used extensively in monitoring biological responses to
various stimuli (Tan et al. 1994; Huang et al. 1995a,b; Prud'homme et
al. 1995). Quantitative gene analysis (DNA) has been used to determine
the genome quantity of a particular gene, as in the case of the human
HER2 gene, which is amplified in ~30% of breast tumors (Slamon et al.
1987). Gene and genome quantitation (DNA and RNA) also have been used
for analysis of human immunodeficiency virus (HIV) burden
demonstrating changes in the levels of virus throughout the different
phases of the disease (Connor et al. 1993; Piatak et al. 1993b;
Furtado et al. 1995).

Many methods have been described for the quantitative analysis of
nucleic acid sequences (both for RNA and DNA; Southern 1975; Sharp et
al. 1980; Thomas 1980). Recently, PCR has proven to be a powerful tool
for quantitative nucleic acid analysis. PCR and reverse transcriptase
(RT)-PCR have permitted the analysis of minimal starting quantities of
nucleic acid (as little as one cell equivalent). This has made
possible many experiments that could not have been performed with
traditional methods. Although PCR has provided a powerful tool, it is
imperative that it be used properly for quantitation (Raeymaekers
1995). Many early reports of quantitative PCR and RT-PCR described
quantitation of the PCR product but did not measure the initial target
sequence quantity. It is essential to design proper controls for the
quantitation of the initial target sequences (Ferre 1992; Clementi et
al.).

Researchers have developed several methods of quantitative PCR and
RT-PCR. One approach measures PCR product quantity in the log phase of
the reaction before the plateau (Kellogg et al. 1990; Pang et al.
1990). This method requires that each sample has equal input amounts
of nucleic acid and that each sample under analysis amplifies with
identical efficiency up to the point of quantitative analysis. A gene
sequence (contained in all samples at relatively constant quantities,
such as β-actin) can be used for sample amplification efficiency
normalization. Using conventional methods of PCR detection and
quantitation (gel electrophoresis or plate capture hybridization), it
is extremely laborious to assure that all samples are analyzed during
the log phase of the reaction (for both the target gene and the
normalization gene). Another method, quantitative competitive
(QC)-PCR, has been developed and is used widely for PCR quantitation.
QC-PCR relies on the inclusion of an internal control competitor in
each reaction (Becker-André 1991; Piatak et al. 1993a,b). The
efficiency of each reaction is normalized to the internal competitor.
A known amount of internal competitor can be added to each sample. To
obtain relative quantitation, the unknown target PCR product is
compared with the known competitor PCR product. Success of a
quantitative competitive PCR assay relies on developing an internal
control that amplifies with the same efficiency as the target
molecule. The design of the competitor and the validation of
amplification efficiencies require a dedicated effort. However,
because QC-PCR does not require that PCR products be analyzed during
the log phase of the amplification, it is the easier of the two
methods to use.

Several detection systems are used for quantitative PCR and RT-PCR
analysis: (1) agarose gels, (2) fluorescent labeling of PCR products
and detection with laser-induced fluorescence using capillary
electrophoresis (Fasco et al. 1995; Williams et al. 1996) or
acrylamide gels, and (3) plate capture and sandwich probe
hybridization (Mulder et al. 1994). Although these methods proved
successful, each method requires post-PCR manipulations that add time
to the analysis and may lead to laboratory contamination. The sample
throughput of these methods is limited (with the exception of the
plate capture approach), and, therefore, these methods are not well
suited for uses demanding high sample throughput (i.e., screening of
large numbers of ... diagnostics or clinical trials).

Here we report the development of a novel assay for quantitative DNA
analysis. The assay is based on the use of the 5' nuclease assay first
described by Holland et al. (1991). The method uses the 5' nuclease
activity of Taq polymerase to cleave a nonextendible hybridization
probe during the extension phase of PCR. The approach uses
dual-labeled fluorogenic hybridization probes (Lee et al. 1993;
Bassler et al. 1995; Livak et al. 1995a,b). One fluorescent dye serves
as a reporter [FAM (i.e., 6-carboxyfluorescein)] and its emission
spectra is quenched by the second fluorescent dye, TAMRA (i.e.,
6-carboxy-tetramethyl-rhodamine). The nuclease degradation of the
hybridization probe releases the quenching of the FAM fluorescent
emission, resulting in an increase in peak fluorescent emission at 518
nm. The use of a sequence detector (ABI Prism) allows measurement of
fluorescent spectra of all 96 wells of the thermal cycler continuously
during the PCR amplification. Therefore, the reactions are monitored
in real time. The output data is described and quantitative analysis
of input target DNA sequences is discussed below.



RESULTS 

PCR Product Detection in Real Time 

The goal was to develop a high-throughput, sensitive, and accurate
gene quantitation assay for use in monitoring lipid-mediated
therapeutic gene delivery. A plasmid encoding human factor VIII gene
sequence, pF8TM (see Methods), was used as a model therapeutic gene.
The assay uses fluorescent TaqMan methodology and an instrument
capable of measuring fluorescence in real time (ABI Prism 7700
Sequence Detector). The TaqMan reaction requires a hybridization probe
labeled with two different fluorescent dyes. One dye is a reporter dye
(FAM), the other is a quenching dye (TAMRA). When the probe is intact,
fluorescent energy transfer occurs and the reporter dye fluorescent
emission is absorbed by the quenching dye (TAMRA). During the
extension phase of the PCR cycle, the fluorescent hybridization probe
is cleaved by the 5'-3' nucleolytic activity of the DNA polymerase. On
cleavage of the probe, the reporter dye emission is no longer
transferred efficiently to the quenching dye, resulting in an increase
of the reporter dye fluorescent emission spectra. PCR primers and
probes were designed for the human factor VIII sequence and human
β-actin gene (as described in Methods). Optimization reactions were
performed to choose the appropriate probe and magnesium concentrations
yielding the highest intensity of reporter fluorescent signal without
sacrificing specificity. The instrument uses a charge-coupled device
(i.e., CCD camera) for measuring the fluorescent emission spectra from
500 to 650 nm. Each PCR tube was monitored sequentially for 25 msec
with continuous monitoring throughout the amplification. Each tube was
re-examined every 8.5 sec. Computer software was designed to examine
the fluorescent intensity of both the reporter dye (FAM) and the
quenching dye (TAMRA). The fluorescent intensity of the quenching dye,
TAMRA, changes very little over the course of the PCR amplification
(data not shown). Therefore, the intensity of TAMRA dye emission
serves as an internal standard with which to normalize the reporter
dye (FAM) emission variations. The software calculates a value termed
ΔRn (or ΔRQ) using the following equation: ΔRn = (Rn⁺) - (Rn⁻), where
Rn⁺ = emission intensity of reporter/emission intensity of quencher at
any given time in a reaction tube, and Rn⁻ = emission intensity of
reporter/emission intensity of quencher measured prior to PCR
amplification in that same reaction tube. For the purpose of
quantitation, the last three data points (ΔRns) collected during the
extension step for each PCR cycle were analyzed. The nucleolytic
degradation of the hybridization probe occurs during the extension
phase of PCR, and, therefore, reporter fluorescent emission increases
during this time. The three data points were averaged for each PCR
cycle and the mean value for each was plotted in an "amplification
plot" shown in Figure 1A. The ΔRn mean value is plotted on the y-axis,
and time, represented by cycle number, is plotted on the x-axis.
During the early cycles of the PCR amplification, the ΔRn



value remains at base line. When sufficient hybridization probe has
been cleaved by the Taq polymerase nuclease activity, the intensity of
reporter fluorescent emission increases. PCR amplifications reach a
plateau phase of reporter fluorescent emission if the reaction is
carried out to high cycle numbers. The amplification plot is examined
early in the reaction, at a point that represents the log phase of
product accumulation. This is done by assigning an arbitrary threshold
that is based on the variability of the base-line data. In Figure 1A,
the threshold was set at 10 standard deviations above the mean of base
line emission calculated from cycles 1 to 15. Once the threshold is
chosen, the point at which

Figure 1 PCR product detection in real time. (A) The Model 7700 software will construct amplification plots
from the extension phase fluorescent emission data collected during the PCR amplification. The standard
deviation is determined from the data points collected from the base line of the amplification plot. C_T values are
calculated by determining the point at which the fluorescence exceeds a threshold limit (usually 10 times the
standard deviation of the base line). (B) Overlay of amplification plots of serially (1:2) diluted human genomic
DNA samples amplified with β-actin primers. (C) Input DNA concentration of samples plotted versus C_T.

the amplification plot crosses the threshold is defined as C_T. C_T is
reported as the cycle number at this point. As will be demonstrated,
the C_T value is predictive of the quantity of input target.
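
The ΔRn and threshold-cycle calculation described above can be sketched as follows. This is not the instrument software; it is a minimal illustration that assumes the threshold is the base-line mean plus 10 standard deviations over cycles 1 to 15, and it uses linear interpolation between neighboring cycles to obtain a fractional C_T, which is an assumption of this sketch. The example amplification plot is hypothetical.

```python
from statistics import mean, stdev


def delta_rn(reporter, quencher, reporter_start, quencher_start):
    """Delta Rn = Rn+ - Rn-, each Rn being reporter / quencher emission."""
    return reporter / quencher - reporter_start / quencher_start


def cycle_value(delta_rn_points):
    """Average the last three Delta Rn points collected in one extension step."""
    return mean(delta_rn_points[-3:])


def threshold_cycle(plot, baseline_cycles=15, n_sd=10.0):
    """Return the fractional cycle at which the plot crosses the threshold.

    plot[0] is the Delta Rn mean for cycle 1. The threshold is the
    base-line mean plus n_sd standard deviations. Returns None if the
    threshold is never crossed.
    """
    baseline = plot[:baseline_cycles]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    for i in range(1, len(plot)):
        previous, current = plot[i - 1], plot[i]
        if previous < threshold <= current:
            fraction = (threshold - previous) / (current - previous)
            return i + fraction  # plot[i - 1] belongs to cycle i
    return None


# Hypothetical amplification plot: 15 noisy base-line cycles, then doubling.
example_plot = [0.00, 0.01] * 7 + [0.00] + [0.02, 0.04, 0.08, 0.16, 0.32]
print(threshold_cycle(example_plot))
```

Deriving the threshold from the base-line scatter, rather than fixing an absolute fluorescence value, keeps the crossing point within the early, exponential part of each amplification plot.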

C_T Values Provide a Quantitative Measurement of 
Input Target Sequences 

Figure 1B shows amplification plots of 15 different PCR amplifications
overlaid. The amplifications were performed on a 1:2 serial dilution
of human genomic DNA. The amplified target was human β-actin. The
amplification plots shift to the right (to higher threshold cycles) as
the input target quantity is reduced. This is expected because
reactions with fewer starting copies of the target molecule require
greater amplification to degrade enough probe to attain the threshold
fluorescence. An arbitrary threshold of 10 standard deviations above
the base line was used to determine the C_T values. Figure 1C
represents the C_T values plotted versus the sample dilution value.
Each dilution was amplified in triplicate PCR amplifications and
plotted as mean values with error bars representing one standard
deviation. The C_T values decrease linearly with increasing target
quantity. Thus, C_T values can be used as a quantitative measurement
of the input target number. It should be noted that the amplification
plot for the 15.6-ng sample shown in Figure 1B does not reflect the
same fluorescent rate of increase exhibited by most of the other
samples. The 15.6-ng sample also achieves endpoint plateau at a lower
fluorescent value than would be expected based on the input DNA. This
phenomenon has been observed occasionally with other samples (data not
shown) and may be attributable to late cycle inhibition; this
hypothesis is still under investigation. It is important to note that
the flattened slope and early plateau do not impact significantly the
calculated C_T value as demonstrated by the fit on the line shown in
Figure 1C. All triplicate amplifications resulted in very similar C_T
values; the standard deviation did not exceed 0.5 for any dilution.
This experiment contains a >100,000-fold range of input target
molecules. Using C_T values for quantitation permits a much larger
assay range than directly using total fluorescent emission intensity
for quantitation. The linear range of fluorescent intensity
measurement of the ABI Prism 7700 Se-
ments over a very large range of relative starting
target quantities.
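
Because C_T falls linearly as the logarithm of the input quantity rises, a dilution series can serve as a standard curve. The sketch below fits C_T against log10(input) by ordinary least squares and inverts the line to estimate an unknown quantity. The standards are hypothetical values generated under an assumed perfect doubling per cycle; they are not data from this study, and the function names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_standard_curve(input_ng, ct_values):
    """Fit C_T as a linear function of log10(input quantity)."""
    logs = [math.log10(quantity) for quantity in input_ng]
    return fit_line(logs, ct_values)


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 1:2 dilution series; one extra cycle per twofold dilution.
standards_ng = [100.0, 50.0, 25.0, 12.5, 6.25]
standards_ct = [18.0, 19.0, 20.0, 21.0, 22.0]

slope, intercept = fit_standard_curve(standards_ng, standards_ct)
print(round(slope, 2), round(quantity_from_ct(20.5, slope, intercept), 1))
```

With real data the slope depends on amplification efficiency, so it would normally be estimated from standards run alongside the unknowns rather than assumed.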

Sample Preparation Validation 

Several parameters influence the efficiency of PCR amplification:
magnesium and salt concentrations, reaction conditions (i.e., time and
temperature), PCR target size and composition, primer sequences, and
sample purity. All of the above factors are common to a single PCR
assay, except sample to sample purity. In an effort to validate the
method of sample preparation for the factor VIII assay, PCR
amplification reproducibility and efficiency of 10 replicate sample
preparations were examined. After genomic DNA was prepared from the 10
replicate samples, the DNA was quantitated by ultraviolet
spectroscopy. Amplifications were performed analyzing β-actin gene
content in 100 and 25 ng of total genomic DNA. Each PCR amplification
was performed in triplicate. Comparison of C_T values for each
triplicate sample show minimal variation based on standard deviation
and coefficient of variance (Table 1). Therefore, each of the
triplicate PCR amplifications was highly reproducible, demonstrating
that real time PCR using this instrumentation introduces minimal
variation into the quantitative PCR analysis. Comparison of the mean
C_T values of the 10 replicate sample preparations also showed minimal
variability, indicating that each sample preparation yielded similar
results for β-actin gene quantity. The highest C_T difference between
any of the samples was 0.85 and 0.73 for the 100 and 25 ng samples,
respectively. Additionally, the amplification of each sample exhibited
an equivalent rate of fluorescent emission intensity change per amount
of DNA target analyzed as indicated by similar slopes derived from the
sample dilutions (Fig. 2). Any sample containing an excess of a PCR
inhibitor would exhibit a greater measured β-actin C_T value for a
given quantity of DNA. In addition, the inhibitor would be diluted
along with the sample in the dilution analysis (Fig. 2), altering the
expected C_T value change. Each sample amplification yielded a similar
result in the analysis, demonstrating that this method of sample
preparation is highly reproducible with regard to sample purity.
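
The per-sample statistics in Table 1 can be checked from the triplicate C_T values. The sketch below assumes the standard deviation is the sample (n - 1) form and that CV is expressed as a percentage of the mean; both assumptions reproduce the tabulated figures for sample 7 at 100 ng. The function name is illustrative.

```python
from statistics import mean, stdev


def triplicate_stats(ct_values):
    """Return (mean, sample standard deviation, CV in percent)."""
    ct_mean = mean(ct_values)
    ct_sd = stdev(ct_values)
    return ct_mean, ct_sd, 100.0 * ct_sd / ct_mean


# Triplicate C_T values for sample 7 at 100 ng, read from Table 1.
sample_7 = [18.28, 18.36, 18.52]

ct_mean, ct_sd, ct_cv = triplicate_stats(sample_7)
print(f"mean={ct_mean:.2f} SD={ct_sd:.2f} CV={ct_cv:.2f}")
```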

Quantitative Analysis of a Plasmid After 

Table 1. Reproducibility of Sample Preparation Method

            100 ng                               25 ng
Sample      C_T      Mean     SD      CV         C_T      Mean     SD      CV
no.
1           18.24                                20.55
            18.23
            18.33    18.27    0.06                        20.51    0.03    0.17
2           18.33                                20.61
            18.35                                20.59
            18.44             0.06               20.41             0.11
3           18.3                                 20.54
            18.3                                 20.6
            18.42    18.34    0.07    0.36       20.49    20.54    0.06
4           18.15                                20.48
            18.23                                20.44
            18.32    18.23    0.08    0.46                20.43    0.05
5           18.4
            18.38                                20.87
                     18.42    0.04                                 0.13
6           18.54                                21.09
            18.67                                21.04
            19       18.74    0.24    1.26       21.04    21.06    0.03    0.15
7           18.28                                20.67
            18.36                                20.73
            18.52    18.39    0.12    0.66       20.65    20.68    0.04    0.2
8           18.43                                20.96
            18.7                                 20.84
                     18.63    0.16    0.83       20.75    20.86    0.12    0.57
9           18.18                                20.46
            18.34                                20.54
            18.36    18.29    0.1                20.48             0.07    0.32
10          18.42                                20.79
            18.57                                20.78
            18.66    18.55    0.12    0.65       20.62    20.73    0.1     0.46
Mean (1-10)          18.42    0.17    0.90                20.66    0.19    0.94




tor containing a partial cDNA for human factor VIII, pF8TM. A series
of transfections was set up using a decreasing amount of the plasmid
(40, 4, 0.5, and 0.1 µg). Twenty-four hours post-transfection, total
DNA was purified from each flask of cells. β-Actin gene quantity was
chosen as a value for normalization of genomic DNA concentration from
each sample. In this experiment, β-actin gene content should remain
constant relative to total genomic DNA. Figure 3 shows the result of
the β-actin DNA measurement (100 ng total DNA determined by
ultraviolet spectroscopy) of each sample. Each sample was analyzed in
triplicate and the mean β-actin C_T values of the triplicates were
plotted (error bars represent



betw4^nn any ivuf* sanij>lc mQans wax U.i)5 C.^ Jen 
nanograms of total DNA of yach sample were also 
rxainliie:d U>t p-aclln. I'lic results «f;ujn >3ii>wed 
thrtl very similar aniouni.1 <)f genomic DNA were: 
present; the maxlniuni mean |i acnin O, value 
diffcrcj>ce wii.s 1.0. As I'igure 3 shows, tlic rate of 
P-actln C,. cliufiKe IxrLwcen the 100 and 10-ng 
sttjnx^lc.-* was siniltur (sJojx? valuc.i rango hotwcen 
3.56 and -3.45). TVii.% verifies again thiit the 
method of -sample prcparalion yields s-ajnplt^s of 
identical PCR integrit-y (i.o-. no sample cont.-^in.ed 
an excessive ainuuni of a ?CR Inhibitor), How. 
ever, ll)C-se result:; Indicate that eHcli sample con 
lalned slight diffciences in the adiifii amount of 
gunumlc DNA nualyxcd. Determination of actual 
tfunoiiiic »>NA ^oncail ration was accomplished 
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l&b 

16 
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• * 3 

" ♦ 5 

» K 

U--T 

A a 

V ft 

« in 



1^ 1.4 




lie 1*0 1.7 1J9 1^ 

log (ng Input genomtc DMA) 



M 



^i^ur« 2 SoiMple prepAraUon purity. 1 he repticdto 
samples shown in Tabl<? "> woro aU;o an^plifrnd U^ 
tripicate iising 25 ng of each DNiA sample. The fig- 
uife showi ihe input DNA conccntfiilion (TOO and 
25 ng) vs, C, In ihi- llQurp. ihp lon nnd 75 ng 
points for «ach sample are connected by a line. 



by plotting the mean β-actin C_T value obtained
for each 100 ng sample on a β-actin standard
curve (shown in Fig. 4C). The actual genomic
DNA concentration of each sample, a, was obtained
by extrapolation to the x axis.
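The extrapolation described above amounts to fitting a straight line of C_T against the logarithm of input quantity and inverting that line for an unknown sample. A minimal Python sketch follows. The function names and the standard-curve values are hypothetical placeholders, not data from this study, and an ordinary least-squares fit is assumed.

```python
import math

def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of C_T = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to read a quantity off the x axis."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical standards (ng of input DNA) and their C_T values.
standard_ng = [100.0, 10.0, 1.0]
standard_ct = [18.0, 21.5, 25.0]

slope, intercept = fit_standard_curve(standard_ng, standard_ct)
unknown_ng = quantity_from_ct(19.75, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} unknown={unknown_ng:.1f} ng")
```

A fitted slope can be compared with the range of slope values reported in the text as a rough check on the curve.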

Figure 4A shows the measured (i.e., non-normalized)
quantities of factor VIII plasmid
DNA (pF8TM) from each of the four transient cell
transfections. Each reaction contained 100 ng of
total sample DNA (as determined by UV spectroscopy).
Each sample was analyzed in triplicate



[Figure 3 plot: x axis, log (ng input DNA); legend, 40, 4.0, 0.5, and 0.1 µg.]

Figure 3 Analysis of transfected cell DNA quantity
and purity. The DNA preparations of the four 293
cell transfections (40, 4, 0.5, and 0.1 µg of pF8TM)
were analyzed for the β-actin gene. 100 and 10 ng
(determined by ultraviolet spectroscopy) of each
sample were amplified in triplicate. For each
amount of pF8TM that was transfected, the β-actin
C_T values are plotted versus the total input DNA



PCR amplifications. As shown, pF8TM purified
from the 293 cells decreases (mean C_T values increase)
with decreasing amounts of plasmid
transfected. The mean C_T values obtained for
pF8TM in Figure 4A were plotted on a standard
curve comprised of serially diluted pF8TM,
shown in Figure 4B. The quantity of pF8TM, b,
found in each of the four transfections was determined
by extrapolation to the x axis of the
standard curve in Figure 4B. These uncorrected
values, b, for pF8TM were normalized to determine
the actual amount of pF8TM found per 100
ng of genomic DNA by using the equation:

(b × 100 ng) / a = actual pF8TM copies per
100 ng of genomic DNA

where a = actual genomic DNA in a sample and
b = pF8TM copies from the standard curve. The
normalized quantity of pF8TM per 100 ng of genomic
DNA for each of the four transfections is
shown in Figure 4D. These results show that the
quantity of factor VIII plasmid associated with
the 293 cells, 24 hr after transfection, decreases
with decreasing plasmid concentration used in
the transfection. The quantity of pF8TM associated
with 293 cells, after transfection with 40 µg
of plasmid, was 35 pg per 100 ng genomic DNA.
This results in ~520 plasmid copies per cell.
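The normalization described above is a simple proportion and can be written out as a short calculation. The Python sketch below is illustrative only: the function name normalize_to_100ng and the example inputs are hypothetical, and the code makes no assumption about the units of b beyond those used for the standard curve.

```python
def normalize_to_100ng(b, a_ng):
    """Scale an uncorrected plasmid quantity b to 100 ng of genomic DNA.

    b    : pF8TM quantity read from the plasmid standard curve
    a_ng : actual genomic DNA in the sample (ng), read from the
           beta-actin standard curve
    """
    if a_ng <= 0:
        raise ValueError("actual genomic DNA amount must be positive")
    return b * 100.0 / a_ng

# Hypothetical example: 30 units of plasmid measured in a sample that
# actually contained 80 ng of genomic DNA.
normalized = normalize_to_100ng(b=30.0, a_ng=80.0)
print(normalized)
```

A sample that contained less genomic DNA than the nominal 100 ng has its plasmid quantity scaled upward, and a sample that contained more is scaled downward.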



DISCUSSION

We have described a new method for quantitating
gene copy numbers using real-time analysis
of PCR amplifications. Real-time PCR is compatible
with either of the two PCR (RT-PCR) approaches:
(1) quantitative competitive where an
internal competitor for each target sequence is
used for normalization (data not shown) or (2)
quantitative comparative PCR using a normalization
gene contained within the sample (i.e., β-actin)
or a "housekeeping" gene for RT-PCR. If
equal amounts of nucleic acid are analyzed for
each sample and if the amplification efficiency
before quantitative analysis is identical for each
sample, the internal control (normalization gene
or competitor) should give equal signals for all
samples.

The real-time PCR method offers several advantages
over the other two methods currently
employed (see the Introduction). First, the real-time
PCR method is performed in a closed-tube
system and requires no post-PCR manipulation

Figure 4 Quantitative analysis of pF8TM in transfected cells. (A) Amount of
plasmid DNA used for the transfection plotted against the mean C_T value determined
for pF8TM remaining 24 hr after transfection. (B,C) Standard curves of
pF8TM and β-actin, respectively. pF8TM DNA (B) and genomic DNA (C) were
diluted serially 1:3 before amplification with the appropriate primers. The β-actin
standard curve was used to normalize the results of A to 100 ng of genomic DNA.
(D) The amount of pF8TM present per 100 ng of genomic DNA.



of sample. Therefore, the potential for PCR contamination
in the laboratory is reduced because
amplified products can be analyzed and disposed
of without opening the reaction tubes. Second,
this method supports the use of a normalization
gene (i.e., β-actin) for quantitative PCR or housekeeping
genes for quantitative RT-PCR controls.
Analysis is performed in real time during the log
phase of product accumulation. Analysis during
log phase permits many different genes (over a
wide input target range) to be analyzed simultaneously,
without concern of reaching reaction
plateau at different cycles. This will make multigene
analysis assays much easier to develop, because
individual internal competitors will not be
needed for each gene under analysis. Third,
sample throughput will increase dramatically
with the new method because there is no post-PCR
processing time. Additionally, working in a
96-well format is highly compatible with automation
technology.

The real-time PCR method is highly reproducible.
Replicate amplifications can be analyzed
for each sample, minimizing potential error. The
system allows for a very large assay dynamic
range (approaching 1,000,000-fold starting target).
Using a standard curve for the target of interest,
relative copy number values can be determined
for any unknown sample. Fluorescent
threshold values, C_T, correlate linearly with relative
DNA copy numbers. Real time quantitative
RT-PCR methodology (Gibson et al., this issue)
has also been developed. Finally, real time quantitative
PCR methodology can be used to develop
high-throughput screening assays for a variety of
applications [quantitative gene expression (RT-PCR),
gene copy assays (Her2, HIV, etc.), genotyping
(knockout mouse analysis), and immuno-PCR].

Real-time PCR may also be performed using
intercalating dyes (Higuchi et al.) such as
ethidium bromide. The fluorogenic probe
method offers a major advantage over intercalating
dyes: greater specificity (i.e., primer
dimers and nonspecific PCR products are not
detected).
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METHODS 

Generation of a Plasmid Containing a Partial
cDNA for Human Factor VIII

Total RNA was harvested (RNAzol B from Tel-Test, Inc.,
Friendswood, TX) from cells transfected with a factor VIII
expression vector, pCIS2.8c25D (Eaton et al. 1986; Gorman
et al. 1990). A factor VIII partial cDNA sequence was
generated by RT-PCR [GeneAmp EZ rTth RNA PCR Kit
(part N808-0179, PE Applied Biosystems, Foster City, CA)]
using the PCR primers F8for and F8rev (primer sequences
are shown below). The amplicon was reamplified using
modified F8for and F8rev primers (appended with BamHI
and HindIII restriction site sequences at the 5' end) and
cloned into pGEM-3Z (Promega Corp., Madison, WI). The
resulting clone, pF8TM, was used for transient transfection
Amplification of Target DNA and Detection of
Amplicon

Factor VIII plasmid DNA (pF8TM) was amplified with F8for
5'-[sequence illegible]-3' and F8rev 5'-[sequence illegible]-3'.
The reaction produced a [...] bp PCR product. The forward
primer was designed to recognize a unique sequence found in
the [...] untranslated region of the parent pCIS2.8c25D plasmid
and therefore does not recognize and amplify the human factor VIII
gene. Primers were chosen with the assistance of the computer
program Oligo 4.0 (National Biosciences, Inc., Plymouth,
MN). The human β-actin gene was amplified with
the primers β-actin forward primer 5'-TCACCCACACTGTGCCCATCTACGA-3'
and β-actin reverse primer 5'-CAGCGGAACCGCTCATTGCCAATGG-3'.
The reaction produced a 295 bp PCR product.

Amplification reactions (50 µl) contained a DNA
sample, 10× PCR Buffer II (5 µl), 200 µM dATP, dCTP,
dGTP, and 400 µM dUTP, 4 mM MgCl2, [...] Units AmpliTaq
DNA polymerase, 0.5 unit AmpErase uracil N-glycosylase
(UNG), 50 pmole of each factor VIII primer, and 15
pmole of each β-actin primer. The reactions also contained
one of the following detection probes (100 nM each):
F8probe 5'-(FAM)[sequence illegible](TAMRA)p-3' and β-actin
probe 5'-(FAM)ATGCCC-X(TAMRA)[remainder illegible]p-3', where
p indicates phosphorylation and X indicates a linker arm nucleotide.
Reaction tubes were MicroAmp Optical Tubes (part number
N801 0933, Perkin Elmer) that were frosted (at Perkin
Elmer) to prevent light from reflecting. Tube caps were
similar to MicroAmp Caps but specially designed to prevent
light scattering. All of the PCR consumables were supplied
by PE Applied Biosystems (Foster City, CA) except
the factor VIII primers, which were synthesized at Genentech,
Inc. (South San Francisco, CA). Probes were designed
using the Oligo 4.0 software, following guidelines suggested
in the Model 7700 Sequence Detector instrument
manual. Briefly, probe T_m should be at least [...] higher
than the annealing temperature used during thermal cycling;
primers should not form duplexes with the
probe.

The thermal cycling conditions included 2 min at
50°C and 10 min at 95°C. Thermal cycling proceeded with


reactions were performed in the Model 7700 Sequence Detector
(PE Applied Biosystems), which contains a GeneAmp
PCR System 9600. Reaction conditions were programmed
on a Power Macintosh 7100 (Apple Computer,
Santa Clara, CA) linked directly to the Model 7700 Sequence
Detector. Analysis of data was also performed on
the Macintosh computer. Collection and analysis software
was developed at PE Applied Biosystems.

Transfection of Cells with Factor VIII Construct

Four T175 flasks of 293 cells (ATCC CRL 1573), a human
fetal kidney suspension cell line, were grown to 80% confluency
and transfected with pF8TM. Cells were grown in the
following media: 50% HAM'S F12 without GHT, 50% low
glucose Dulbecco's modified Eagle medium (DMEM) without
glycine with sodium bicarbonate, 10% fetal bovine
serum, 2 mM L-glutamine, and 1% penicillin-streptomycin.
The media was changed 30 min before the transfection.
pF8TM DNA amounts of 40, 4, 0.5, and 0.1 µg were
added to 1.5 ml of a solution containing 0.125 M CaCl2
and 1× HEPES. The four mixtures were left at room temperature
for 10 min and then added dropwise to the cells.
The flasks were incubated at 37°C and [...] CO2 for 24 hr,
washed with PBS, and resuspended in PBS. The resuspended
cells were divided into aliquots and DNA was extracted
immediately using the QIAamp [...] Kit (Qiagen,
Chatsworth, CA). DNA was eluted into 200 µl of [...] mM
Tris-HCl at pH 8.0.
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ABSTRACT Wnt family members are critical to many 
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification 
of two genes, WISP-1 and WISP-2, that are up-regulated in the
mouse mammary epithelial cell line C57MG transformed by
Wnt-1, but not by Wnt-4. Together with a third related gene,
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demonstrated
WISP induction to be associated with the expression of
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA
overexpressed (2- to >30-fold) in 84% of the tumors examined
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overexpressed
(4- to >40-fold) in 63% of the colon tumors analyzed.
In contrast, WISP-2 mapped to human chromosome 20q12-20q13
and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results
suggest that the WISP genes may be downstream of Wnt-1
signaling and that aberrant levels of WISP expression in colon
cancer may play a role in colon tumorigenesis.



Wnt-1 is a member of an expanding family of cysteine-rich, 
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen 
synthase kinase-3β (GSK-3β) resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the transcription
factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that
the adenomatous polyposis coli (APC) tumor suppressor gene
also plays an important role in Wnt signaling by regulating
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations contribute
to the development of these types of cancer, implicating
the Wnt pathway in tumorigenesis (1).

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the tran- 
scriptionally activated downstream components activated by 
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to
Wnt signaling. Among the candidate Wnt target genes are
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1 
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which
includes connective tissue growth factor (CTGF), Cyr61, and
nov, a family not previously linked to Wnt signaling.

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA 
Subtraction Kit (CLONTECH). Tester double-stranded
cDNA was synthesized from 2 µg of poly(A)+ RNA isolated
from the C57MG/Wnt-1 cell line and driver cDNA from 2 µg
of poly(A)+ RNA from the parent C57MG cells. The subtracted
cDNA library was subcloned into a pGEM-T vector for
further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.

Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones encoding
full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512. Full-length
cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 µM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles.
WISP and glyceraldehyde-3-phosphate dehydrogenase primer 
sequences are available on request. 

In Situ Hybridization. 33P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of
mouse WISP-2. All tissues were processed as described (40).

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of
Technology web servers.

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29, 
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
2^(ΔCt) where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate dehydrogenase
housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems. 
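
The 2^(ΔCt) calculation described above can be illustrated with a brief Python sketch. The function names and the C_T values are hypothetical. The sketch assumes that ΔCt is taken as the reference C_T minus the test-sample C_T and that normalization to the housekeeping gene is performed by dividing the target ratio by the housekeeping ratio; the source does not spell out either convention.

```python
def relative_quantity(ct_reference, ct_sample):
    """Return 2^(delta Ct), with delta Ct = ct_reference - ct_sample.

    A sample detected two cycles earlier than the reference yields 4.0.
    """
    return 2.0 ** (ct_reference - ct_sample)

def normalized_relative_quantity(target_ref, target_sample, hk_ref, hk_sample):
    """Divide the target-gene ratio by the housekeeping-gene ratio (assumed convention)."""
    return relative_quantity(target_ref, target_sample) / relative_quantity(hk_ref, hk_sample)

# Hypothetical C_T values: target gene in reference vs. tumor material,
# and the housekeeping gene in the same two samples.
fold = normalized_relative_quantity(
    target_ref=25.0, target_sample=23.0, hk_ref=20.0, hk_sample=19.0
)
print(fold)
```

In this hypothetical case the target appears two cycles earlier in the test sample, but one cycle of that difference is attributed to input amount by the housekeeping gene, leaving a 2-fold change.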

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify Wnt-1-inducible
genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially expressed
cDNAs (1,384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homologues,
32% matched expressed sequence tags, and 29% had
no match. To confirm that the transcript was differentially
expressed, semiquantitative reverse transcription-PCR and
Northern analysis were performed by using mRNA from the
C57MG and C57MG/Wnt-1 cells.

Two of the cDNAs, WISP-1 and WISP-2, were differentially
expressed, being induced in the C57MG/Wnt-1 cell line, but 
not in the parent C57MG cells or C57MG cells overexpressing 
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce 
the morphological transformation of C57MG cells and has no 
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell 
line and WISP-2 by approximately 5-fold by both Northern 
analysis and reverse transcription-PCR. 

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells expressing
the Wnt-1 gene under the control of a tetracycline-repressible
promoter produce low amounts of Wnt-1 in the
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the
induction of WISPs may be an indirect response to Wnt-1
signaling. 

cDNA clones of human WISP-1 were isolated and the
sequence compared with mouse WISP-1. The cDNA sequences
of mouse and human WISP-1 were 1,766 and 2,830 bp in length,
respectively, and encode proteins of 367 aa, with predicted
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,293 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative molecular
masses of ≈27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at
position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/Wnt-4
cells. Poly(A)+ RNA (2 µg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-44%)
was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that associate
with the cell surface and extracellular matrix.

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).
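
Consensus motifs of the kind cited above (GCGCCXXC and WSxCSxxCG, where x or X stands for any residue) can be located in a protein sequence with a simple pattern search. The Python sketch below is illustrative only: the function names are hypothetical, the test sequences are invented toy strings rather than WISP sequences, and the code assumes that every x or X in a motif is a single-residue wildcard.

```python
import re

def motif_to_regex(motif):
    """Translate a consensus motif into a regular expression.

    Assumption: 'x' or 'X' matches any single residue; every other
    character must match literally.
    """
    parts = ["[A-Z]" if ch in "xX" else re.escape(ch) for ch in motif]
    # A lookahead is used so that overlapping occurrences are all reported.
    return re.compile("(?=(" + "".join(parts) + "))")

def find_motif(sequence, motif):
    """Return the 0-based start positions of the motif in the sequence."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]

# Invented toy sequences, used only to exercise the search.
print(find_motif("AAWSPCSKTCGAA", "WSxCSxxCG"))
print(find_motif("MGCGCCAACLL", "GCGCCXXC"))
```

A search of this kind reports only exact matches to the consensus; a substitution at a fixed position, such as the glutamine noted for WISP-1, would not be reported unless the pattern were relaxed.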

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expres-
sion of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-
level WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas






Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The correspond-
ing dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal (s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately
3.48 cR from the meiotic marker AFM259xc5 [logarithm of
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the
same region as the human locus of the novH family member
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine
mapping indicates that WISP-1 is located near D8S1712 STS.
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to chro-
mosome 6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene 
MYB (27, 29). 
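
For readers unfamiliar with lod scores, a lod score is the base-10 logarithm of the odds in favor of linkage, so it can be converted back to an odds ratio directly. The sketch below is illustrative only; the helper names are hypothetical and nothing here reproduces the mapping software used for the panels.

```python
import math


def lod_to_odds(lod):
    """Convert a lod score to the odds ratio it represents (10 ** lod)."""
    return 10.0 ** lod


def odds_to_lod(odds):
    """Convert an odds ratio (linkage : no linkage) to a lod score."""
    return math.log10(odds)


# The lod score of 16.31 quoted for WISP-1 corresponds to odds of 10 ** 16.31 to 1.
print(f"{lod_to_odds(16.31):.3e}")
```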

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in 
many human tumors and has etiological and prognostic sig-
nificance. For example, in a variety of tumor types, c-myc
amplification has been associated with malignant progression
and poor prognosis (30). Because WISP-1 resides in the same
general chromosomal location (8q24) as c-myc, we asked
whether it was a target of gene amplification, and, if so,
whether this amplification was independent of the c-myc locus.
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold)
amplification, with the HT-29 and WiDr cell lines demonstrat-
ing an 8-fold increase. Significantly, the pattern of amplifica-
tion observed did not correlate with that observed for c-myc,
indicating that the c-myc gene is not part of the amplicon that
involves the WISP-1 locus.

We next examined whether the WISP genes were amplified 
in a panel of 25 primary human colon adenocarcinomas. The 
relative WISP gene copy number in each colon tumor DNA 
was compared with pooled normal DNA from 10 donors by 
quantitative PCR (Fig. 6). The copy number of WISP-1 and
WISP-2 was significantly greater than one, approximately
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was signifi-
cantly higher than that of WISP-1 (P < 0.001).
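
A relative copy number of this kind is commonly derived from quantitative PCR threshold cycles. The text here does not state which calculation was used, so the following is only a sketch under stated assumptions: the comparative threshold-cycle (delta-delta Ct) method, perfect doubling per cycle, and a reference gene assumed to be present at normal copy number. The function name and the Ct values are hypothetical.

```python
def relative_copy_number(ct_target_tumor, ct_ref_tumor,
                         ct_target_normal, ct_ref_normal):
    """Fold difference in target copy number, tumor versus normal DNA.

    Assumes the comparative Ct method with 100% amplification efficiency;
    this is an illustrative assumption, not the method stated in the text.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor      # normalize tumor to reference gene
    delta_normal = ct_target_normal - ct_ref_normal   # normalize pooled normal DNA
    return 2.0 ** -(delta_tumor - delta_normal)


# Hypothetical Ct values: the target crosses threshold one cycle earlier in tumor DNA.
print(relative_copy_number(24.0, 25.0, 25.0, 25.0))
```

Under these assumptions a one-cycle shift corresponds to a 2-fold difference, which is the scale of the amplification reported for WISP-1.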

The levels of WISP transcripts in RNA isolated from 19 
adenocarcinomas and their matched normal mucosa were

Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quanti-
tative PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal



[Fig. 7 plot: panels WISP-1, WISP-2, and WISP-3; x axis, Patient #/Dukes Stage.]

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment 
done in triplicate. The experiment was repeated at least twice. 



mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and 
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which
includes CTGF, Cyr61, and nov, a family not previously linked
to Wnt signaling.

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral
vector or C57MG cells expressing Wnt-1 under the control of
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations, 
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly 
induced by the downstream components of the Wnt-1 signaling 
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1 
signaling turning on a transcription factor, which in turn 
regulates WISPs. 

The WISPs define an additional subfamily of the CCN family 
of growth factors. One striking difference observed in the 
protein sequence of WISP-2 is the absence of a CT domain,
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3.
This domain is thought to be involved in receptor binding and
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine
knot motif exist as dimers (32). It is tempting to speculate that
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2
exists as a monomer. If the CT domain is also important for
receptor binding, WISP-2 may bind its receptor through a
different region of the molecule than the other CCN family
members. No specific receptors have been identified for CTGF
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33). 

The strong expression of WISP-1 and WISP-2 in cells lying
within the fibrovascular tumor stroma in breast tumors from
Wnt-1 transgenic animals is consistent with previous obser-
vations that transcripts for the related CTGF gene are pri-
marily expressed in the fibrous stroma of mammary tumors
(34). Epithelial cells are thought to control the proliferation of
connective tissue stroma in mammary tumors by a cascade of
growth factor signals similar to that controlling connective
tissue formation during wound repair. It has been proposed
that mammary tumor cells or inflammatory cells at the tumor
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

It was of interest that WISP-1 and WISP-2 expression was
observed in the stromal cells that surrounded the tumor cells
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracel-
lular matrix. Stromal cell-derived factors in the extracellular
matrix have been postulated to play a role in tumor cell
migration and proliferation (35). The localization of WISP-1
and WISP-2 in the stromal cells of breast tumors supports this
paracrine model.

An analysis of WISP-1 gene amplification and expression in
human colon tumors showed a correlation between DNA
amplification and overexpression, whereas overexpression of
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors,
but its mRNA expression was significantly reduced in the
majority of tumors compared with the expression in normal
colonic mucosa from the same patient. The gene for human
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node negative breast cancer and many colon cancers, suggest-
ing the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification
observed for WISP-2 may be caused by another gene in this
amplicon.

A recent manuscript on rCop-1, the rat orthologue of
WISP-2, describes the loss of expression of this gene after cell
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant 
transformation is unknown, the reduced expression of WISP-2 
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to hu-
man carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1
transforms cells and induces tumorigenesis is unknown, the
identification of WISPs as genes that may be regulated down-
stream of Wnt-1 in C57MG cells suggests they could be
important mediators of Wnt-1 transformation. The amplifica-
tion and altered expression patterns of the WISPs in human
colon tumors may indicate an important role for these genes 
in tumor development.

methods. Peptides AENK or AEQK were dissolved in water, made isotonic with
NaCl and diluted into RPMI growth medium. T-cell proliferation assays were
done essentially as described. Briefly, after antigen pulsing (30 μg ml⁻¹
TTCF) with tetrapeptides (1-2 mg ml⁻¹), PBMCs or EBV-B cells were
washed in PBS and fixed for 45 s in 0.05% glutaraldehyde. Glycine was added
to a final concentration of 0.1 M and the cells were washed five times in RPMI
1640 medium containing 1% FCS before co-culture with T-cell clones in
round-bottom 96-well microtitre plates. After 48 h, the cultures were pulsed
with 1 μCi of ³H-thymidine and harvested for scintillation counting 16 h later.
Predigestion of native TTCF was done by incubating 200 μg TTCF with 0.25 μg
pig kidney legumain in 500 μl 50 mM citrate buffer, pH 5.5, for 1 h at 37 °C.
Glycopeptide digestions. The peptides HIDNEEDI, HIDN(N-glucosamine)EEDI
and HIDNESDI, which are based on the TTCF sequence, and
QQQHLFGSNVTDCSGNFCLFR(KKK), which is based on human transferrin,
were obtained by custom synthesis. The three C-terminal lysine residues were
added to the natural sequence to aid solubility. The transferrin glycopeptide
QQQHLFGSNVTDCSGNFCLFR was prepared by tryptic (Promega) digestion
of 5 mg reduced, carboxymethylated human transferrin followed by
concanavalin A chromatography. Glycopeptides corresponding to residues
622-642 and 421-452 were isolated by reverse-phase HPLC and identified by
mass spectrometry and N-terminal sequencing. The lyophilized transferrin-
derived peptides were redissolved in 50 mM sodium acetate, pH 5.5, 10 mM
dithiothreitol, 20% methanol. Digestions were performed for 3 h at 30 °C with
5-50 mU ml⁻¹ pig kidney legumain or B-cell AEP. Products were analysed by
HPLC or MALDI-TOF mass spectrometry using a matrix of 10 mg ml⁻¹ α-
cyanocinnamic acid in 50% acetonitrile/0.1% TFA and a PerSeptive Biosystems
Elite STR mass spectrometer set to linear or reflector mode. Internal standar-
dization was obtained with a matrix ion of 568.13 mass units.

Fas ligand (FasL) is produced by activated T cells and natural
killer cells and it induces apoptosis (programmed cell death) in
target cells through the death receptor Fas/Apo1/CD95 (ref. 1).
One important role of FasL and Fas is to mediate immune-
cytotoxic killing of cells that are potentially harmful to the
organism, such as virus-infected or tumour cells. Here we
report the discovery of a soluble decoy receptor, termed decoy
receptor 3 (DcR3), that binds to FasL and inhibits FasL-induced
apoptosis. The DcR3 gene was amplified in about half of 35
primary lung and colon tumours studied, and DcR3 messenger
RNA was expressed in malignant tissue. Thus, certain tumours
may escape FasL-dependent immune-cytotoxic attack by expres-
sing a decoy receptor that blocks FasL.

By searching expressed sequence tag (EST) databases, we identi-
fied a set of related ESTs that showed homology to the tumour
necrosis factor (TNF) receptor (TNFR) gene superfamily. Using
the overlapping sequence, we isolated a previously unknown full-
length complementary DNA from human fetal lung. We named the
protein encoded by this cDNA decoy receptor 3 (DcR3). The cDNA
encodes a 300-amino-acid polypeptide that resembles members of
the TNFR family (Fig. 1a): the amino terminus contains a leader
sequence, which is followed by four tandem cysteine-rich domains
(CRDs). Like one other TNFR homologue, osteoprotegerin (OPG),
DcR3 lacks an apparent transmembrane sequence, which indicates
that it may be a secreted, rather than a membrane-associated,
molecule. We expressed a recombinant, histidine-tagged form of
DcR3 in mammalian cells; DcR3 was secreted into the cell culture
medium, and migrated on polyacrylamide gels as a protein of
relative molecular mass 35,000 (data not shown). DcR3 shares
sequence identity in particular with OPG (31%) and TNFR2
(29%), and has relatively less homology with Fas (17%). All of
the cysteines in the four CRDs of DcR3 and OPG are conserved;
however, the carboxy-terminal portion of DcR3 is 101 residues
shorter.

We analysed expression of DcR3 mRNA in human tissues by
northern blotting (Fig. 1b). We detected a predominant 1.2-kilobase
transcript in fetal lung, brain, and liver, and in adult spleen, colon
and lung. In addition, we observed relatively high DcR3 mRNA
expression in the human colon carcinoma cell line SW480.

To investigate potential ligand interactions of DcR3, we generated 
a recombinant, Fc-tagged DcR3 protein. We tested binding of 
DcR3-Fc to human 293 cells transfected with individual TNF- 
family ligands, which are expressed as type 2 transmembrane 
proteins (these transmembrane proteins have their N termini in 
the cytosol). DcR3-Fc showed a significant increase in binding to 
cells transfected with FasL* (Fig. 2a), but not to cells transfected with 
TNFS Apo2L/TRAIL'^ Apo3L/TWEAK"'\ or OPGUTRANCE/ 
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RANKL"*"'^ (data not shown), DcR3-Fc immunoprecipitated shed 
FasL from FasL-transfected 293 cells (Fig. 2b) and purified soluble 
FasL (Fig. 2c), as did the Fc-tagged ectodomain of Fas but not 
TNFRl. Gel-fihration chromatography showed that DcR3-Fc and 
soluble FasL formed a stable complex (Fig. 2d). Equilibrium 
analysis indicated that DcR3-Fc and Fas-Fc bound to soluble 
FasL with a comparable affinity (K^ — 0.8 ± 0.2 and 
l.liO.lnM, respectively; Fig. 2e), and that DcR3-Fc could 
block nearly all of the binding of soluble FasL to Fas-Fc (Fig. 2e, 
inset). Thus, DcR3 competes with Fas for binding to FasL. 
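
To illustrate what dissociation constants of about 0.8 and 1.1 nM imply, the fraction of receptor occupied at a given free ligand concentration can be estimated from a simple one-site binding model. This is a sketch under assumptions not stated in the text: a single class of independent sites, equilibrium, and no ligand depletion. The function and constant names are hypothetical.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fractional receptor occupancy for a one-site model: L / (Kd + L)."""
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (kd_nM + ligand_nM)


# Mean Kd values quoted in the text (nM); the one-site model is an assumption.
KD_DCR3_FC = 0.8
KD_FAS_FC = 1.1

for ligand in (0.1, 1.0, 10.0):
    print(ligand,
          round(fraction_bound(ligand, KD_DCR3_FC), 3),
          round(fraction_bound(ligand, KD_FAS_FC), 3))
```

Under this model the two receptors reach half-maximal occupancy at ligand concentrations equal to their respective Kd values, which is why the two affinities are described as comparable.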

To determine whether binding of DcR3 inhibits FasL activity, we 
tested the effect of DcR3-Fc on apoptosis induction by soluble 
FasL in Jurkat T leukaemia cells, which express Fas (Fig. 3a). DcR3-
Fc and Fas-Fc blocked soluble-FasL-induced apoptosis in a
similar dose-dependent manner, with half-maximal inhibition at
~0.1 μg ml⁻¹. Time-course analysis showed that the inhibition did
not merely delay cell death, but rather persisted for at least 24 hours
(Fig. 3b). We also tested the effect of DcR3-Fc on activation-
induced cell death (AICD) of mature T lymphocytes, a FasL-
dependent process. Consistent with previous results, activation
of interleukin-2-stimulated CD4-positive T cells with anti-CD3
antibody increased the level of apoptosis twofold, and Fas-Fc
blocked this effect substantially (Fig. 3c); DcR3-Fc blocked the
induction of apoptosis to a similar extent. Thus, DcR3 binding
blocks apoptosis induction by FasL.

FasL-induced apoptosis is important in elimination of virus- 
infected cells and cancer cells by natural killer cells and cytotoxic T 
lymphocytes; an alternative mechanism involves perforin and
granzymes. Peripheral blood natural killer cells triggered
marked cell death in Jurkat T leukaemia cells (Fig. 3d); DcR3-Fc
and Fas-Fc each reduced killing of target cells from ~65% to
~30%, with half-maximal inhibition at ~1 μg ml⁻¹; the residual
killing was probably mediated by the perforin/granzyme pathway.
Thus, DcR3 binding blocks FasL-dependent natural killer cell
activity. Higher DcR3-Fc and Fas-Fc concentrations were required
to block natural killer cell activity compared with those required to
block soluble FasL activity, which is consistent with the greater
potency of membrane-associated FasL compared with soluble
FasL.

Given the role of immune-cytotoxic cells in elimination of 
tumour cells and the fact that DcR3 can act as an inhibitor of
FasL, we proposed that DcR3 expression might contribute to the
ability of some tumours to escape immune-cytotoxic attack. As
genomic amplification frequently contributes to tumorigenesis, we
investigated whether the DcR3 gene is amplified in cancer. We
analysed DcR3 gene-copy number by quantitative polymerase chain



Figure 1 Primary structure and expression of human DcR3. a, Alignment of the
amino-acid sequences of DcR3 and of osteoprotegerin (OPG); the C-terminal 101
residues of OPG are not shown. The putative signal cleavage site (arrow), the
cysteine-rich domains (CRD1-4), and the N-linked glycosylation site (asterisk) are
shown. b, Expression of DcR3 mRNA. Northern hybridization analysis was done
using the DcR3 cDNA as a probe and blots of poly(A)+ RNA (Clontech) from
human fetal and adult tissues or cancer cell lines. PBL, peripheral blood
lymphocyte.

Figure 2 Interaction of DcR3 with FasL. a, 293 cells were transfected with pRK5
vector (top) or with pRK5 encoding full-length FasL (bottom), incubated with
DcR3-Fc (solid line, shaded area), TNFR1-Fc (dotted line) or buffer control
(dashed line) (the dashed and dotted lines overlap), and analysed for binding by
FACS. Statistical analysis showed a significant difference (P < 0.001) between the
binding of DcR3-Fc to cells transfected with FasL or pRK5. PE, phycoerythrin-
labelled cells. b, 293 cells were transfected as in a and metabolically labelled, and
cell supernatants were immunoprecipitated with Fc-tagged TNFR1, DcR3 or Fas.
c, Purified soluble FasL (sFasL) was immunoprecipitated with TNFR1-Fc, DcR3-
Fc or Fas-Fc and visualized by immunoblot with anti-FasL antibody. sFasL was
loaded directly for comparison in the right-hand lane. d, Flag-tagged sFasL was
incubated with DcR3-Fc or with buffer and resolved by gel filtration; column
fractions were analysed in an assay that detects complexes containing DcR3-Fc
and sFasL-Flag. e, Equilibrium binding of DcR3-Fc or Fas-Fc to sFasL-Flag.
Inset, competition of DcR3-Fc with Fas-Fc for binding to sFasL-Flag.

reaction (PCR) in genomic DNA from 35 primary lung and colon
tumours, relative to pooled genomic DNA from peripheral blood
leukocytes (PBLs) of 10 healthy donors. Eight of 18 lung tumours
and 9 of 17 colon tumours showed DcR3 gene amplification,
ranging from 2- to 18-fold (Fig. 4a, b). To confirm this result, we
analysed the colon tumour DNAs with three more, independent sets
of DcR3-based PCR primers and probes; we observed nearly the
same amplification (data not shown).

We then analysed DcR3 mRNA expression in primary tumour 
tissue sections by in situ hybridization. We detected DcR3 expres- 
sion in 6 out of 15 lung tumours, 2 out of 2 colon tumours, 2 out of 5 
breast tumours, and 1 out of 1 gastric tumour (data not shown). A 
section through a squamous-cell carcinoma of the lung is shown in 
Fig. 4c. DcR3 mRNA was localized to infiltrating malignant epithe-
lium, but was essentially absent from adjacent stroma, indicating
tumour-specific expression. Although the individual tumour speci-
mens that we analysed for mRNA expression and gene amplification
were different, the in situ hybridization results are consistent with
the finding that the DcR3 gene is amplified frequently in tumours.
SW480 colon carcinoma cells, which showed abundant DcR3
mRNA expression (Fig. 1b), also had marked DcR3 gene amplifica-
tion, as shown by quantitative PCR (fourfold) and by Southern blot
hybridization (fivefold) (data not shown).

If DcR3 amplification in cancer is functionally relevant, then 
DcR3 should be amplified more than neighbouring genomic 
regions that are not important for tumour survival. To test this,
we mapped the human DcR3 gene by radiation-hybrid analysis;
DcR3 showed linkage to marker AFM218xe7 (T160), which maps to
chromosome position 20q13. Next, we isolated from a bacterial
artificial chromosome (BAC) library a human genomic clone that
carries DcR3, and sequenced the ends of the clone's insert. We then
determined, from the nine colon tumours that showed twofold or
greater amplification of DcR3, the copy number of the DcR3-
flanking sequences (reverse and forward) from the BAC, and of
seven genomic markers that span chromosome 20 (Fig. 4d). The
DcR3-linked reverse marker showed an average amplification of
roughly threefold, slightly less than the approximately fourfold
amplification of DcR3; the other markers showed little or no
amplification. These data indicate that DcR3 may be at the 'epi-
centre' of a distal chromosome 20 region that is amplified in colon
cancer, consistent with the possibility that DcR3 amplification
promotes tumour survival.

Our results show that DcR3 binds specifically to FasL and inhibits 
FasL activity. We did not detect DcR3 binding to several other TNF-
ligand-family members; however, this does not rule out the possi-
bility that DcR3 interacts with other ligands, as do some other
TNFR family members, including OPG.

FasL is important in regulating the immune response; however, 
little is known about how FasL function is controlled. One mechan-
ism involves the molecule cFLIP, which modulates apoptosis signal-
ling downstream of Fas. A second mechanism involves proteolytic
shedding of FasL from the cell surface. DcR3 competes with Fas for

Figure 3 Inhibition of FasL activity by DcR3. a, Human Jurkat T leukaemia cells
were incubated with Flag-tagged soluble FasL (sFasL, 5 ng ml⁻¹) oligomerized
with anti-Flag antibody (0.1 μg ml⁻¹) in the presence of the proposed inhibitors
DcR3-Fc, Fas-Fc or human IgG1 and assayed for apoptosis (mean ± s.e.m. of
triplicates). b, Jurkat cells were incubated with sFasL-Flag plus anti-Flag antibody
as in a, in presence of 1 μg ml⁻¹ DcR3-Fc (filled circles), Fas-Fc (open circles) or
human IgG1 (triangles), and apoptosis was determined at the indicated time
points. c, Peripheral blood T cells were stimulated with PHA and interleukin-2,
followed by control (white bars) or anti-CD3 antibody (filled bars), together with
phosphate-buffered saline (PBS), human IgG1, Fas-Fc, or DcR3-Fc (10 μg ml⁻¹).
After 16 h, apoptosis of CD4+ cells was determined (mean ± s.e.m. of results from
five donors). d, Peripheral blood natural killer cells were incubated with ⁵¹Cr-
labelled Jurkat cells in the presence of DcR3-Fc (filled circles), Fas-Fc (open
circles) or human IgG1 (triangles), and target-cell death was determined by
release of ⁵¹Cr (mean ± s.d. for two donors, each in triplicate).

Figure 4 Genomic amplification of DcR3 in tumours. a, Lung cancers, comprising
eight adenocarcinomas (c, d, f, g, h, j, k, r), seven squamous-cell carcinomas (a, e,
m, n, o, p, q), one non-small-cell carcinoma (b), one small-cell carcinoma (i), and
one bronchial adenocarcinoma (l). The data are means ± s.d. of 2 experiments
done in duplicate. b, Colon tumours, comprising 17 adenocarcinomas. Data are
means ± s.e.m. of five experiments done in duplicate. c, In situ hybridization
analysis of DcR3 mRNA expression in a squamous-cell carcinoma of the lung. A
representative bright-field image (left) and the corresponding dark-field image
(right) show DcR3 mRNA over infiltrating malignant epithelium (arrowheads).
Adjacent non-malignant stroma (S), blood vessel (V) and necrotic tumour tissue
(N) are also shown. d, Average amplification of DcR3 compared with amplifica-
tion of neighbouring genomic regions (reverse and forward, Rev and Fwd), the
DcR3-linked marker T160, and other chromosome-20 markers, in the nine colon
tumours showing DcR3 amplification of twofold or more (b). Data are from two
experiments done in duplicate. Asterisk indicates P < 0.01 for a Student's t-test
comparing each marker with DcR3.

FasL binding; hence, it may represent a third mechanism of
extracellular regulation of FasL activity. A decoy receptor that
modulates the function of the cytokine interleukin-1 has been
described. In addition, two decoy receptors that belong to the
TNFR family, DcR1 and DcR2, regulate the FasL-related apoptosis-
inducing molecule Apo2L. Unlike DcR1 and DcR2, which are
membrane-associated proteins, DcR3 is directly secreted into the
extracellular space. One other secreted TNFR-family member is
OPG, which shares greater sequence homology with DcR3 (31%)
than do DcR1 (17%) or DcR2 (19%); OPG functions as a third
decoy for Apo2L. Thus, DcR3 and OPG define a new subset of
TNFR-family members that function as secreted decoys to mod-
ulate ligands that induce apoptosis. Pox viruses produce soluble
TNFR homologues that neutralize specific TNF-family ligands,
thereby modulating the antiviral immune response. Our results
indicate that a similar mechanism, namely, production of a soluble
decoy receptor for FasL, may contribute to immune evasion by
certain tumours.

Methods 

Isolation of DcR3 cDNA. Several overlapping ESTs in GenBank (accession 
numbers AA025672, AA025673 and W67560) and in Lifeseq™ (Incyte 
Pharmaceuticals; accession numbers 1339238, 1533571, 1533650, 1542861,
1789372 and 2207027) showed similarity to members of the TNFR family. We
screened human cDNA libraries by PCR with primers based on the region of
EST consensus; fetal lung was positive for a product of the expected size. By
hybridization to a PCR-generated probe based on the ESTs, one positive clone
(DNA30942) was identified. When searching for potential alternatively spliced
forms of DcR3 that might encode a transmembrane protein, we isolated 50 
more clones; the coding regions of these clones were identical in size to that of 
the initial clone (data not shown). 

Fc-fusion proteins (immunoadhesins). The entire DcR3 sequence, or the
ectodomain of Fas or TNFR1, was fused to the hinge and Fc region of human
IgG1, expressed in insect SF9 cells or in human 293 cells, and purified as
described.

Fluorescence-activated cell sorting (FACS) analysis. We transfected 293 
cells using calcium phosphate or Effectene (Qiagen) with pRK5 vector or pRK5 
encoding full-length human FasL (2 μg), together with pRK5 encoding CrmA
(2 μg) to prevent cell death. After 16 h, the cells were incubated with
biotinylated DcR3-Fc or TNFR1-Fc and then with phycoerythrin-conjugated
streptavidin (GibcoBRL), and were assayed by FACS. The data were analysed by
Kolmogorov-Smirnov statistical analysis. There was some detectable staining
of vector-transfected cells by DcR3-Fc; as these cells express little FasL (data
not shown), it is possible that DcR3 recognized some other factor that is
expressed constitutively on 293 cells. 
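
The Kolmogorov-Smirnov comparison mentioned above measures the largest gap between the cumulative distributions of two samples, for example fluorescence intensities from two cell populations. The following is a minimal sketch of the two-sample statistic only; it is not the analysis software used for the FACS data, the function name is hypothetical, and the intensity values are invented for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the maximum absolute
    difference between the empirical cumulative distribution functions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Invented fluorescence intensities (arbitrary units), for illustration only.
vector_cells = [10, 12, 15, 18, 20]
fasl_cells = [40, 55, 60, 72, 90]
print(ks_statistic(vector_cells, fasl_cells))
```

A value of D near 1 indicates almost completely separated distributions, whereas a value near 0 indicates overlapping ones; converting D to a P value requires the sample sizes and is not shown here.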

Immunoprecipitation. Human 293 cells were transfected as above, and 
metabolically labelled with [³⁵S]cysteine and [³⁵S]methionine (0.5 mCi;
Amersham). After 16 h of culture in the presence of z-VAD-fmk (10 μM),
the medium was immunoprecipitated with DcR3-Fc, Fas-Fc or TNFR1-Fc
(5 μg), followed by protein A-Sepharose (Repligen). The precipitates were
resolved by SDS-PAGE and visualized on a phosphorimager (Fuji BAS2000).
Alternatively, purified, Flag-tagged soluble FasL (1 μg) (Alexis) was incubated
with each Fc-fusion protein (1 μg), precipitated with protein A-Sepharose,
resolved by SDS-PAGE and visualized by immunoblotting with rabbit anti-
FasL antibody (Oncogene Research). 

Analysis of complex formation. Flag-tagged soluble FasL (25 µg) was 
incubated with buffer or with DcR3-Fc (40 µg) for 1.5 h at 24 °C. The reaction 
was loaded onto a Superdex 200 HR 10/30 column (Pharmacia) and developed 
with PBS; 0.6-ml fractions were collected. The presence of DcR3-Fc-FasL 
complex in each fraction was analysed by placing 100 µl aliquots into microtitre 
wells precoated with anti-human IgG (Boehringer) to capture DcR3-Fc, 
followed by detection with biotinylated anti-Flag antibody Bio M2 (Kodak) and 
streptavidin-horseradish peroxidase (Amersham). Calibration of the column 
indicated an apparent relative molecular mass of the complex of 420K (data not 
shown), which is consistent with a stoichiometry of two DcR3-Fc homodimers 
to two soluble FasL homotrimers. 

Equilibrium binding analysis. Microtitre wells were coated with anti-human 
IgG, blocked with 2% BSA in PBS. DcR3-Fc or Fas-Fc was added, followed by 
serially diluted Flag-tagged soluble FasL. Bound ligand was detected with 
anti-Flag antibody as above. In the competition assay, Fas-Fc was immobilized as 
above, and the wells were blocked with excess IgG1 before addition of 
Flag-tagged soluble FasL plus DcR3-Fc. 

T-cell AICD. CD3+ lymphocytes were isolated from peripheral blood of 
individual donors using anti-CD3 magnetic beads (Miltenyi Biotech), 
stimulated with phytohaemagglutinin (PHA; 2 µg ml⁻¹) for 24 h, and cultured 
in the presence of interleukin-2 (100 U ml⁻¹) for 5 days. The cells were plated in 
wells coated with anti-CD3 antibody (Pharmingen) and analysed for apoptosis 
16 h later by FACS analysis of annexin-V-binding of CD4+ cells. 

Natural killer cell activity. Natural killer cells were isolated from peripheral 
blood of individual donors using anti-CD56 magnetic beads (Miltenyi 
Biotech), and incubated for 16 h with 51Cr-loaded Jurkat cells at an 
effector-to-target ratio of 1:1 in the presence of DcR3-Fc, Fas-Fc or human IgG1. 
Target-cell death was determined by release of 51Cr in effector-target 
co-cultures relative to release of 51Cr by detergent lysis of equal numbers of Jurkat 
cells. 

Gene-amplification analysis. Surgical specimens were provided by J. Kern 
(lung tumours) and P. Quirke (colon tumours). Genomic DNA was extracted 
(Qiagen) and the concentration was determined using Hoechst dye 33258 
intercalation fluorometry. Amplification was determined by quantitative PCR 
using a TaqMan instrument (ABI). The method was validated by comparison of 
PCR and Southern hybridization data for the Myc and HER-2 oncogenes (data 
not shown). Gene-specific primers and fluorogenic probes were designed on 
the basis of the sequence of DcR3 or of nearby regions identified on a BAC 
carrying the human DcR3 gene; alternatively, primers and probes were based 
on Stanford Human Genome Center marker AFM218xe7 (T160), which is 
linked to DcR3 (likelihood score = 5.4), SHGC-36268 (T159), the nearest 
available marker which maps to ~500 kilobases from T160, and five extra 
markers that span chromosome 20. The DcR3-specific primer sequences were 
5'-CTTCTTCGGGGACGGTG-3' and 5'-ATGAGGGCGGGACCAG-3' and the 
fluorogenic probe sequence was 5'-(FAM-ACACGATGGGTGGTCGAAGGAG 
AAp-(TAMARA), where FAM is 5'-fluorescein phosphoramidite. Relative 
gene-copy numbers were derived using the formula $2^{\Delta C_T}$, where $\Delta C_T$ is the 
difference in amplification cycles required to detect DcR3 in peripheral blood 
lymphocyte DNA compared to test DNA. 
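
The relative copy-number formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the example cycle values are hypothetical, and the sign convention (cycles for peripheral blood lymphocyte DNA minus cycles for test DNA, so that an amplified gene gives a value above 1) is an assumption read from the wording of the text. The base of 2 presumes that the product doubles at every cycle.

```python
def relative_copy_number(ct_normal: float, ct_test: float) -> float:
    """Relative gene-copy number as 2 ** delta_ct.

    delta_ct is taken here as the cycle number needed to detect the gene in
    normal (peripheral blood lymphocyte) DNA minus the cycle number needed
    in the test DNA. This sign convention is an assumption.
    """
    delta_ct = ct_normal - ct_test
    return 2.0 ** delta_ct


# Hypothetical example: the test DNA is detected two cycles earlier than
# the normal DNA, which corresponds to a four-fold relative copy number.
example_ratio = relative_copy_number(ct_normal=27.0, ct_test=25.0)
print(example_ratio)
```

A test sample detected at the same cycle as the normal DNA gives a ratio of 1, that is, no amplification.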
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ABC transporters (also known as traffic ATPases) form a large 
family of proteins responsible for the translocation of a variety 
of compounds across membranes of both prokaryotes and 
eukaryotes. The recently completed Escherichia coli genome 
sequence revealed that the largest family of paralogous E. coli 
proteins is composed of ABC transporters. Many eukaryotic 
proteins of medical significance belong to this family, such as 
the cystic fibrosis transmembrane conductance regulator (CFTR), 
the P-glycoprotein (or multidrug-resistance protein) and the 
heterodimeric transporter associated with antigen processing 
(Tap1-Tap2). Here we report the crystal structure at 1.5 Å resolution 
of HisP, the ATP-binding subunit of the histidine permease, 
which is an ABC transporter from Salmonella typhimurium. We 
correlate the details of this structure with the biochemical, genetic 
and biophysical properties of the wild-type and several mutant 
HisP proteins. The structure provides a basis for understanding 
properties of ABC transporters and of defective CFTR proteins. 

ABC transporters contain four structural domains: two nucleotide-binding 
domains (NBDs), which are highly conserved 
throughout the family, and two transmembrane domains. In 
prokaryotes these domains are often separate subunits which are 
assembled into a membrane-bound complex; in eukaryotes the 
domains are generally fused into a single polypeptide chain. The 
periplasmic histidine permease of S. typhimurium and E. coli is a 
well-characterized ABC transporter that is a good model for this 
superfamily. It consists of a membrane-bound complex, HisQMP2, 
which comprises integral membrane subunits, HisQ and HisM, and 
two copies of HisP, the ATP-binding subunit. HisP, which has 
properties intermediate between those of integral and peripheral 
membrane proteins, is accessible from both sides of the membrane, 
presumably by its interaction with HisQ and HisM. The two HisP 
subunits form a dimer, as shown by their cooperativity in ATP 
hydrolysis, the requirement for both subunits to be present for 
activity, and the formation of a HisP dimer upon chemical 
cross-linking. Soluble HisP also forms a dimer. HisP has been purified 
and characterized in an active soluble form which can be 
reconstituted into a fully active membrane-bound complex. 

The overall shape of the crystal structure of the HisP monomer is 
that of an 'L' with two thick arms (arm I and arm II); the ATP-binding 
pocket is near the end of arm I (Fig. 1). A six-stranded β-sheet 
(β3 and β8-β12) spans both arms of the L, with a domain of an 
α- plus β-type structure (β1, β2, β4-β7, α1 and α2) on one side 
(within arm I) and a domain of mostly α-helices (α3-α9) on the 
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Figure 1 Crystal structure of HisP. a. View of the dimer along an axis 
perpendicular to its two-fold axis. The top and bottom of the dimer are suggested 
to face towards the periplasmic and cytoplasmic sides, respectively (see text). 
The thickness of arm II is about 25 Å, comparable to that of membrane. α-Helices 
are shown in orange and β-sheets in green. b, View along the two-fold axis of the 
HisP dimer, showing the relative displacement of the monomers not apparent in 
a. The β-strands at the dimer interface are labelled. c, View of one monomer from 
the bottom of arm I, as shown in a, towards arm II, showing the ATP-binding 
pocket. a-c, The protein and the bound ATP are in 'ribbon' and 'ball-and-stick' 
representations, respectively. Key residues discussed in the text are indicated in 
c. These figures were prepared with MOLSCRIPT. N, amino terminus; C, C 
terminus. 
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Gene amplification is a common event in the progression of 
human cancers, and amplified oncogenes have been shown to 
have diagnostic, prognostic and therapeutic relevance. A 
kinetic quantitative polymerase-chain-reaction (PCR) method, 
based on fluorescent TaqMan methodology and a new instrument 
(ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real-time, was used to quantify 
gene amplification in tumor DNA. Reactions are characterized 
by the point during cycling when PCR amplification is still 
in the exponential phase, rather than the amount of PCR 
product accumulated after a fixed number of cycles. None of 
the reaction components is limited during the exponential 
phase, meaning that values are highly reproducible in reactions 
starting with the same copy number. This greatly 
improves the precision of DNA quantification. Moreover, 
real-time PCR does not require post-PCR sample handling, 
thereby preventing potential PCR-product carry-over contamination; 
it possesses a wide dynamic range of quantification 
and results in much faster and higher sample throughput. 
The real-time PCR method was used to develop and validate 
a simple and rapid assay for the detection and quantification 
of the 3 most frequently amplified genes (myc, ccnd1 and 
erbB2) in breast tumors. Extra copies of myc, ccnd1 and erbB2 
were observed in 10, 23 and 15%, respectively, of 108 breast-tumor 
DNA; the largest observed numbers of gene copies 
were 4.6, 18.6 and 15.1, respectively. These results correlated 
well with those of Southern blotting. The use of this new 
semi-automated technique will make molecular analysis of 
human cancers simpler and more reliable, and should find 
broad applications in clinical and research settings. Int. J. 
Cancer 78:661-666, 1998. 
© 1998 Wiley-Liss, Inc. 

Gene amplification plays an important role in the pathogenesis 
of various solid tumors, including breast cancer, probably because 
over-expression of the amplified target genes confers a selective 
advantage. The first technique used to detect genomic amplification 
was cytogenetic analysis. Amplification of several chromosome 
regions, visualized either as extrachromosomal double minutes 
(dmins) or as integrated homogeneously staining regions (HSRs), 
are among the main visible cytogenetic abnormalities in breast 
tumors. Other techniques such as comparative genomic hybridization 
(CGH) (Kallioniemi et al., 1994) have also been used in broad 
searches for regions of increased DNA copy numbers in tumor 
cells, and have revealed some 20 amplified chromosome regions in 
breast tumors. Positional cloning efforts are underway to identify 
the critical gene(s) in each amplified region. To date, genes known 
to be amplified frequently in breast cancers include myc (8q24), 
ccnd1 (11q13), and erbB2 (17q12-q21) (for review, see Bieche and 
Lidereau, 1995). 

Amplification of the myc, ccnd1, and erbB2 proto-oncogenes 
should have clinical relevance in breast cancer, since independent 
studies have shown that these alterations can be used to identify 
sub-populations with a worse prognosis (Berns et al., 1992; 
Schuuring et al., 1992; Slamon et al., 1987). Muss et al. (1994) 
suggested that these gene alterations may also be useful for the 
prediction and assessment of the efficacy of adjuvant chemotherapy 
and hormone therapy. 

However, published results diverge both in terms of the frequency 
of these alterations and their clinical value. For instance, 
over 500 studies in 10 years have failed to resolve the controversy 
surrounding the link suggested by Slamon et al. (1987) between 
erbB2 amplification and disease progression. These discrepancies 
are partly due to the clinical, histological and ethnic heterogeneity 
of breast cancer, but technical considerations are also probably 
involved. 

Specific genes (DNA) were initially quantified in tumor cells by 
means of blotting procedures such as Southern and slot blotting. 
These batch techniques require large amounts of DNA (5-10 
µg/reaction) to yield reliable quantitative results. Furthermore, 
meticulous care is required at all stages of the procedures to 
generate blots of sufficient quality for reliable dosage analysis. 
Recently, PCR has proven to be a powerful tool for quantitative 
DNA analysis, especially with minimal starting quantities of tumor 
samples (small, early-stage tumors and formalin-fixed, paraffin- 
embedded tissues). 

Quantitative PCR can be performed by evaluating the amount of 
product either after a given number of cycles (end-point quantita- 
tive PCR) or after a varying number of cycles during the 
exponential phase (kinetic quantitative PCR). In the first case, an 
internal standard distinct from the target molecule is required to 
ascertain PCR efficiency. The method is relatively easy but implies 
generating, quantifying and storing an internal standard for each 
gene studied. Nevertheless, it is the most frequently applied 
method to date. 

One of the major advantages of the kinetic method is its rapidity 
in quantifying a new gene, since no internal standard is required (an 
external standard curve is sufficient). Moreover, the kinetic method 
has a wide dynamic range (at least 5 orders of magnitude), giving 
an accurate value for samples differing in their copy number. 
Unfortunately, the method is cumbersome and has therefore been 
rarely used. It involves aliquot sampling of each assay mix at 
regular intervals and quantifying, for each aliquot, the amplifica- 
tion product. Interest in the kinetic method has been stimulated by a 
novel approach using fluorescent TaqMan methodology and a new 
instrument (ABI Prism 7700 Sequence Detection System) capable 
of measuring fluorescence in real time (Gibson et al., 1996; Heid et 
al., 1996). The TaqMan reaction is based on the 5' nuclease assay 
first described by Holland et al. (1991). The latter uses the 5' 
nuclease activity of Taq polymerase to cleave a specific fluorogenic 
oligonucleotide probe during the extension phase of PCR. The 
approach uses dual-labeled fluorogenic hybridization probes (Lee 
et al., 1993). One fluorescent dye, covalently linked to the 5' end 
of the oligonucleotide, serves as a reporter [FAM (i.e., 
6-carboxy-fluorescein)] and its emission spectrum is quenched by a second 
fluorescent dye, TAMRA (i.e., 6-carboxy-tetramethyl-rhodamine) 
attached to the 3' end. During the extension phase of the PCR 
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cycle, the fluorescent hybridization probe is hydrolyzed by the 
5'-3' nucleolytic activity of DNA polymerase. Nuclease degrada- 
tion of the probe releases the quenching of FAM fluorescence 
emission, resulting in an increase in peak fluorescence emission. 
The fluorescence signal is normalized by dividing the emission 
intensity of the reporter dye (FAM) by the emission intensity of a 
reference dye (i.e., ROX, 6-carboxy-X-rhodamine) included in 
TaqMan buffer, to obtain a ratio defined as the Rn (normalized 
reporter) for a given reaction tube. The use of a sequence detector 
enables the fluorescence spectra of all 96 wells of the thermal 
cycler to be measured continuously during PCR amplification. 

The real-time PCR method offers several advantages over other 
current quantitative PCR methods (Celi et al., 1994): (i) the 
probe-based homogeneous assay provides a real-time method for 
detecting only specific amplification products, since specific 
hybridization of both the primers and the probe is necessary to generate a 
signal; (ii) the Ct (threshold cycle) value used for quantification is 
measured when PCR amplification is still in the log phase of PCR 
product accumulation. This is the main reason why Ct is a more 
reliable measure of the starting copy number than are end-point 
measurements, in which a slight difference in a limiting component 
can have a drastic effect on the amount of product; (iii) use of Ct 
values gives a wider dynamic range (at least 5 orders of magnitude), 
reducing the need for serial dilution; (iv) the real-time PCR 
method is run in a closed-tube system and requires no post-PCR 
sample handling, thus avoiding potential contamination; (v) the 
system is highly automated, since the instrument continuously 
measures fluorescence in all 96 wells of the thermal cycler during 
PCR amplification and the corresponding software processes and 
analyzes the fluorescence data; (vi) the assay is rapid, as results are 
available just one minute after thermal cycling is complete; (vii) the 
sample throughput of the method is high, since 96 reactions can be 
analyzed in 2 hr. 

Here, we applied this semi-automated procedure to determine 
the copy numbers of the 3 most frequently amplified genes in breast 
tumors (myc, ccnd1 and erbB2), as well as 2 genes (alb and app) 
located in a chromosome region in which no genetic changes have 
been observed in breast tumors. The results for 108 breast tumors 
were compared with previous Southern-blot data for the same 
samples. 

MATERIAL AND METHODS 
Tumor and blood samples 

Samples were obtained from 108 primary breast tumors removed 
surgically from patients at the Centre René Huguenin; none of the 
patients had undergone radiotherapy or chemotherapy. Immediately 
after surgery, the tumor samples were placed in liquid 
nitrogen until extraction of high-molecular-weight DNA. Patients 
were included in this study if the tumor sample used for DNA 
preparation contained more than 60% of tumor cells (histological 
analysis). A blood sample was also taken from 18 of the same 
patients. 

DNA was extracted from tumor tissue and blood leukocytes 
according to standard methods. 

Real-time PCR 

Theoretical basis. Reactions are characterized by the point 
during cycling when amplification of the PCR product is first 
detected, rather than by the amount of PCR product accumulated 
after a fixed number of cycles. The higher the starting copy number 
of the genomic DNA target, the earlier a significant increase in 
fluorescence is observed. The parameter Ct (threshold cycle) is 
defined as the fractional cycle number at which the fluorescence 
generated by cleavage of the probe passes a fixed threshold above 
baseline. The target gene copy number in unknown samples is 
quantified by measuring Ct and by using a standard curve to 
determine the starting copy number. The precise amount of 
genomic DNA (based on optical density) and its quality (i.e., lack 
of extensive degradation) are both difficult to assess. We therefore 
also quantified a control gene (alb) mapping to chromosome region 
4q11-q13, in which no genetic alterations have been found in 
breast-tumor DNA by means of CGH (Kallioniemi et al., 1994). 
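
As a rough illustration of the threshold-cycle definition above, the following Python sketch normalizes the reporter signal (FAM divided by the ROX reference, as described in the text), subtracts a baseline taken from the first 15 cycles, and finds the fractional cycle at which the signal first crosses a fixed threshold. The function names, the linear interpolation between adjacent cycles and the input layout are assumptions made for this sketch; the instrument software may compute Ct differently.

```python
from typing import Optional, Sequence


def normalized_reporter(fam: Sequence[float], rox: Sequence[float]) -> list:
    """Rn per cycle: reporter (FAM) emission divided by reference (ROX) emission."""
    return [f / r for f, r in zip(fam, rox)]


def threshold_cycle(rn: Sequence[float], threshold: float,
                    baseline_cycles: int = 15) -> Optional[float]:
    """Fractional cycle at which baseline-subtracted Rn first reaches the threshold.

    rn[0] is the reading for cycle 1. The baseline is the mean Rn over the
    first `baseline_cycles` cycles. Linear interpolation between the two
    cycles that bracket the threshold is an assumption of this sketch.
    Returns None if the threshold is never reached.
    """
    baseline = sum(rn[:baseline_cycles]) / baseline_cycles
    delta_rn = [value - baseline for value in rn]
    for i in range(baseline_cycles, len(delta_rn)):
        below, above = delta_rn[i - 1], delta_rn[i]
        if below < threshold <= above:
            # delta_rn[i - 1] belongs to cycle i, delta_rn[i] to cycle i + 1
            return i + (threshold - below) / (above - below)
    return None
```

A lower Ct means the threshold was reached earlier, which corresponds to a higher starting copy number.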

Thus, the ratio of the copy number of the target gene to the copy 
number of the alb gene normalizes the amount and quality of 
genomic DNA. The ratio defining the level of amplification is 
termed "N", and is determined as follows: 

$$N = \frac{\text{copy number of target gene } (app,\ myc,\ ccnd1,\ erbB2)}{\text{copy number of reference gene } (alb)}$$

Primers, probes, reference human genomic DNA and PCR 
consumables. Primers and probes were chosen with the assistance 
of the computer programs Oligo 4.0 (National Biosciences, 
Plymouth, MN), EuGene (Daniben Systems, Cincinnati, OH) and Primer 
Express (Perkin-Elmer Applied Biosystems, Foster City, CA). 

Primers were purchased from DNAgency (Malvern, PA) and 
probes from Perkin-Elmer Applied Biosystems. 

Nucleotide sequences for the oligonucleotide hybridization 
probes and primers are available on request. 

The TaqMan PCR Core reagent kit, MicroAmp optical tubes, 
and MicroAmp caps were from Perkin-Elmer Applied Biosystems. 

Standard-curve construction. The kinetic method requires a 
standard curve. The latter was constructed with serial dilutions of 
specific PCR products, according to Piatak et al. (1993). In 
practice, each specific PCR product was obtained by amplifying 20 
ng of a standard human genomic DNA (Boehringer, Mannheim, 
Germany) with the same primer pairs as those used later for 
real-time quantitative PCR. The 5 PCR products were purified 
using MicroSpin S-400 HR columns (Pharmacia, Uppsala, 
Sweden), electrophoresed through an acrylamide gel and stained with 
ethidium bromide to check their quality. The PCR products were 
then quantified spectrophotometrically and pooled, and serially 
diluted 10-fold in mouse genomic DNA (Clontech, Palo Alto, CA) 
at a constant concentration of 2 ng/µl. The standard curve used for 
real-time quantitative PCR was based on serial dilutions of the pool 
of PCR products ranging from 10^-7 (10^5 copies of each gene) to 
10^-10 (10^2 copies). This series of diluted PCR products was 
aliquoted and stored at -80°C until use. 

The standard curve was validated by analyzing 2 known 
quantities of calibrator human genomic DNA (20 ng and 50 ng). 
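
The way a standard curve of this kind converts a measured Ct into a starting copy number can be sketched as follows. The sketch assumes that Ct is linear in the logarithm of the starting copy number and fits the line by ordinary least squares; the function names and the standard values in the example are hypothetical and are not data from the study.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(copies: Sequence[float],
                       cts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the fitted line to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: 10-fold dilutions with made-up Ct values that lie
# on a line with a slope of -3.32 cycles per decade (perfect doubling).
standard_copies = [1e2, 1e3, 1e4, 1e5]
standard_cts = [40.0 - 3.32 * math.log10(c) for c in standard_copies]
slope, intercept = fit_standard_curve(standard_copies, standard_cts)
```

An unknown sample's Ct is then passed to `copies_from_ct` together with the fitted slope and intercept from the same run.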

PCR amplification. Amplification mixes (50 µl) contained the 
sample DNA (around 20 ng, around 6600 copies of disomic genes), 
10X TaqMan buffer (5 µl), 200 µM dATP, dGTP, dCTP, and 400 
µM dUTP, 5 mM MgCl2, 1.25 units of AmpliTaq Gold, 0.5 units of 
AmpErase uracil N-glycosylase (UNG), 200 nM each primer and 
100 nM probe. The thermal cycling conditions comprised 2 min at 
50°C and 10 min at 95°C. Thermal cycling consisted of 40 cycles at 
95°C for 15 s and 65°C for 1 min. Each assay included: a standard 
curve (from 10^5 to 10^2 copies) in duplicate, a no-template control, 
20 ng and 50 ng of calibrator human genomic DNA (Boehringer) in 
triplicate, and about 20 ng of unknown genomic DNA in triplicate 
(26 samples can thus be analyzed on a 96-well microplate). All 
samples with a coefficient of variation (CV) higher than 10% were 
retested. 

All reactions were performed in the ABI Prism 7700 Sequence 
Detection System (Perkin-Elmer Applied Biosystems), which 
detects the signal from the fluorogenic probe during PCR. 

Equipment for real-time detection. The 7700 system has a 
built-in thermal cycler and a laser directed via fiber optical cables 
to each of the 96 sample wells. A charge-coupled-device (CCD) 
camera collects the emission from each sample and the data are 
analyzed automatically. The software accompanying the 7700 
system calculates Ct and determines the starting copy number in the 
samples. 






Determination of gene amplification. Gene amplification was 
calculated as described above. Only samples with an N value 
higher than 2 were considered to be amplified. 
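
The amplification level N and the cut-off can be illustrated with the ccnd1 and alb copy numbers listed with Figure 2 for tumors T118, T133 and T145. The sketch below is a minimal illustration rather than the authors' software: the function names are hypothetical, the pairing of values with tumors is read from the figure data, and the cut-off is applied as N of 2 or more, following the wording used in the Results and the bins of Table I.

```python
def amplification_level(target_copies: float, reference_copies: float) -> float:
    """N = copy number of target gene / copy number of reference gene (alb)."""
    return target_copies / reference_copies


def is_amplified(n: float, cutoff: float = 2.0) -> bool:
    """Call a sample amplified when N reaches the cut-off (assumed inclusive)."""
    return n >= cutoff


# (ccnd1 copies, alb copies) per tumor, as listed with Figure 2.
figure2_copies = {
    "T118": (4605, 4365),
    "T133": (61659, 10092),
    "T145": (125892, 7762),
}

levels = {tumor: amplification_level(ccnd1, alb)
          for tumor, (ccnd1, alb) in figure2_copies.items()}

for tumor, n in levels.items():
    print(f"{tumor}: N = {n:.1f}, amplified = {is_amplified(n)}")
```

The resulting values of about 1, 6 and 16 agree with the text, which describes T118 as non-amplified, T133 as amplified 6-fold and T145 as amplified 16-fold.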

RESULTS 

To validate the method, real-time PCR was performed on 
genomic DNA extracted from 108 primary breast tumors, and 18 
normal leukocyte DNA samples from some of the same patients. 
The target genes were the myc, ccnd1 and erbB2 proto-oncogenes, 
and the β-amyloid precursor protein gene (app), which maps to a 
chromosome region (21q21.2) in which no genetic alterations have 
been found in breast tumors (Kallioniemi et al., 1994). The 
reference disomic gene was the albumin gene (alb, chromosome 
4q11-q13). 



Validation of the standard curve and dynamic range 
of real-time PCR 

The standard curve was constructed from PCR products serially 
diluted in genomic mouse DNA at a constant concentration of 
2 ng/µl. It should be noted that the 5 primer pairs chosen to analyze 
the 5 target genes do not amplify genomic mouse DNA (data not 
shown). Figure 1 shows the real-time PCR standard curve for the 
alb gene. The dynamic range was wide (at least 4 orders of 
magnitude), with samples containing as few as 10^2 copies or as 
many as 10^5 copies. 

Copy-number ratio of the 2 reference genes (app and alb) 

The app to alb copy-number ratio was determined in 18 normal 
leukocyte DNA samples and all 108 primary breast-tumor DNA 
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Figure 1 - Albumin (alb) gene dosage by real-time PCR. Top: Amplification plots for reactions with starting alb gene copy number ranging 
from 10^5 (A9), 10^4 (A7), 10^3 (A4) to 10^2 (A2) and a no-template control (A1). Cycle number is plotted vs. change in normalized reporter signal 
(ΔRn). For each reaction tube, the fluorescence signal of the reporter dye (FAM) is divided by the fluorescence signal of the passive reference dye 
(ROX), to obtain a ratio defined as the normalized reporter signal (Rn). ΔRn represents the normalized reporter signal (Rn) minus the baseline 
signal established in the first 15 PCR cycles. ΔRn increases during PCR as alb PCR product copy number increases until the reaction reaches a 
plateau. Ct (threshold cycle) represents the fractional cycle number at which a significant increase in Rn above a baseline signal (horizontal black 
line) can first be detected. Two replicate plots were performed for each standard sample, but the data for only one are shown here. Bottom: 
Standard curve plotting log starting copy number vs. Ct (threshold cycle). The black dots represent the data for standard samples plotted in 
duplicate and the red dots the data for unknown genomic DNA samples plotted in triplicate. The standard curve shows 4 orders of linear dynamic 
range. 
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samples. We selected these 2 genes because they are located in 2 
chromosome regions (app, 21q21.2; alb, 4q11-q13) in which no 
obvious genetic changes (including gains or losses) have been 
observed in breast cancers (Kallioniemi et al., 1994). The ratio for 
the 18 normal leukocyte DNA samples fell between 0.7 and 1.3 
(mean 1.02 ± 0.21), and was similar for the 108 primary 
breast-tumor DNA samples (0.6 to 1.6, mean 1.06 ± 0.25), confirming 
that alb and app are appropriate reference disomic genes for 
breast-tumor DNA. The low range of the ratios also confirmed that 
the nucleotide sequences chosen for the primers and probes were 
not polymorphic, as mismatches of their primers or probes with the 
subject's DNA would have resulted in differential amplification. 

myc, ccnd1 and erbB2 gene dose in normal leukocyte DNA 

To determine the cut-off point for gene amplification in 
breast-cancer tissue, 18 normal leukocyte DNA samples were tested for 
the gene dose (N), calculated as described in "Material and 
Methods". The N value of these samples ranged from 0.5 to 1.3 
(mean 0.84 ± 0.22) for myc; 0.7 to 1.6 (mean 1.06 ± 0.23) for 
ccnd1 and 0.6 to 1.3 (mean 0.91 ± 0.19) for erbB2. Since N values 
for myc, ccnd1 and erbB2 in normal leukocyte DNA consistently 
fell between 0.5 and 1.6, values of 2 or more were considered to 
represent gene amplification in tumor DNA. 

myc, ccnd1 and erbB2 gene dose in breast-tumor DNA 

myc, ccnd1 and erbB2 gene copy numbers in the 108 primary 
breast tumors are reported in Table I. Extra copies of ccnd1 were 
more frequent (23%, 25/108) than extra copies of erbB2 (15%, 
16/108) and myc (10%, 11/108), and ranged from 2 to 18.6 for 
ccnd1, 2 to 15.1 for erbB2, and only 2 to 4.6 for the myc gene. 
Figure 2 and Table II represent tumors in which the ccnd1 gene was 
amplified 16-fold (T145), 6-fold (T133) and non-amplified (T118). 
The 3 genes were never found to be co-amplified in the same tumor. 
erbB2 and ccnd1 were co-amplified in only 3 cases, myc and ccnd1 
in 2 cases and myc and erbB2 in 1 case. This favors the hypothesis 
that gene amplifications are independent events in breast cancer. 
Interestingly, 5 tumors showed a decrease of at least 50% in the 
erbB2 copy number (N < 0.5), suggesting that they bore deletions 
of the 17q21 region (the site of erbB2). No such decrease in copy 
number was observed with the other 2 proto-oncogenes. 

Comparison of gene dose determined by real-time quantitative 
PCR and Southern-blot analysis 

Southern-blot analysis of myc, ccnd1 and erbB2 amplifications 
had previously been done on the same 108 primary breast tumors. A 
perfect correlation between the results of real-time PCR and 
Southern blot was obtained for tumors with high copy numbers 
(N ≥ 5). However, there were cases (1 myc, 6 ccnd1 and 4 erbB2) 
in which real-time PCR showed gene amplification whereas 
Southern-blot did not, but these were mainly cases with low extra 
copy numbers (N from 2 to 2.9). 

DISCUSSION 

The clinical applications of gene amplification assays are 
currently limited, but would certainly increase if a simple, standard- 
ized and rapid method were perfected. Gene amplification status 
has been studied mainly by means of Southern blotting, but this 
method is not sensitive enough to detect low-level gene amplifica- 
tion nor accurate enough to quantify the full range of amplification 
values. Southern blotting is also time-consuming, uses radioactive 
reagents and requires relatively large amounts of high-quality 
genomic DNA, which means it cannot be used routinely in many 
laboratories. An amplification step is therefore required to 
determine the copy number of a given target gene from minimal 
quantities of tumor DNA (small early-stage tumors, cytopuncture 
specimens or formalin-fixed, paraffin-embedded tissues). 

TABLE I - DISTRIBUTION OF AMPLIFICATION LEVEL (N) FOR myc, 
ccnd1 AND erbB2 GENES IN 108 HUMAN BREAST TUMORS 

                    Amplification level (N) 
Gene     <0.5        0.5-1.9       2-4.9         ≥5 
myc      0           97 (89.8%)    11 (10.2%)    0 
ccnd1    0           83 (76.9%)    17 (15.7%)    8 (7.4%) 
erbB2    5 (4.6%)    87 (80.6%)    8 (7.4%)      8 (7.4%) 

In this study, we validated a PCR method developed for the 
quantification of gene over-representation in tumors. The method, 
based on real-time analysis of PCR amplification, has several 
advantages over other PCR-based quantitative assays such as 
competitive quantitative PCR (Celi et al., 1994). First, the real-time 
PCR method is performed in a closed-tube system, avoiding the 
risk of contamination by amplified products. Re-amplification of 
carryover PCR products in subsequent experiments can also be 
prevented by using the enzyme uracil N-glycosylase (UNG) 
(Longo et al., 1990). The second advantage is the simplicity and 
rapidity of sample analysis, since no post-PCR manipulations are 
required. Our results show that the automated method is reliable. 
We found it possible to determine, in triplicate, the number of 
copies of a target gene in more than 100 tumors per day. Third, the 
system has a linear dynamic range of at least 4 orders of magnitude, 
meaning that samples do not have to contain equal starting amounts 
of DNA. This technique should therefore be suitable for analyzing 
formalin-fixed, paraffin-embedded tissues. Fourth, and above all, 
real-time PCR makes DNA quantification much more precise and 
reproducible, since it is based on Ct values rather than end-point 
measurement of the amount of accumulated PCR product. Indeed, 
the ABI Prism 7700 Sequence Detection System enables Ct to be 
calculated when PCR amplification is still in the exponential phase 
and when none of the reaction components is rate-limiting. The 
within-run CV of the Ct value for calibrator human DNA (5 
replicates) was always below 5%, and the between-assay precision 
in 5 different runs was always below 10% (data not shown). In 
addition, the use of a standard curve is not absolutely necessary, 
since the copy number can be determined simply by comparing the 
Ct ratio of the target gene with that of reference genes. The results 
obtained by the 2 methods (with and without a standard curve) are 
similar in our experiments (data not shown). Moreover, unlike 
competitive quantitative PCR, real-time PCR does not require an 
internal control (the design and storage of internal controls and the 
validation of their amplification efficiency is laborious). 

The only potential disadvantage of real-time PCR, like all other 
PCR-based methods and solid-matrix blotting techniques 
(Southern blots and dot blots), is that it cannot avoid dilution artifacts 
inherent in the extraction of DNA from tumor cells contained in 
heterogeneous tissue specimens. Only FISH and 
immunohistochemistry can measure alterations on a cell-by-cell basis (Pauletti et al., 
1996; Slamon et al., 1989). However, FISH requires expensive 
equipment and trained personnel and is also time-consuming. 
Moreover, FISH does not assess gene expression and therefore 
cannot detect cases in which the gene product is over-expressed in 
the absence of gene amplification, which will be possible in the 
future by real-time quantitative RT-PCR. Immunohistochemistry is 
subject to considerable variations in the hands of different teams, 
owing to alterations of target proteins during the procedure, the 
different primary antibodies and fixation methods used and the 
criteria used to define positive staining. 

The results of this study are in agreement with those reported in 
the literature. (i) Chromosome regions 4q11-q13 and 21q21.2 
(which bear alb and app, respectively) showed no genetic alter- 
ations in the breast-cancer samples studied here, in keeping with 
the results of CGH (Kallioniemi et al., 1994). (ii) We found that 
amplifications of these 3 oncogenes were independent events, as 
reported by other teams (Berns et al., 1992; Borg et al., 1992). (iii) 
The frequency and degree of myc amplification in our breast tumor 
DNA series were lower than those of ccnd1 and erbB2 amplifica- 
tion, confirming the findings of Borg et al. (1992) and Courjal et al. 
(1997). (iv) The maxima of ccnd1 and erbB2 over-representation 
were 18-fold and 15-fold, also in keeping with earlier results (about 
[Figure 2: real-time PCR amplification plots against cycle number for FAM-labeled samples A8, E12, G11, B4, C6 and C8. Values shown in the figure:]

Tumor    CCND1 Ct    CCND1 copy number    ALB Ct    ALB copy number
T118     27.3        4605                 26.5      4365
T133     23.2        61659                25.2      10092
T145     22.1        125892               25.6      7762

Figure 2 - ccnd1 and alb gene dosage by real-time PCR in 3 breast tumor samples: T118 (E12, C6, black squares), T133 (G11, B4, red squares) 
and T145 (A8, C8, blue squares). Given the Ct of each sample, the initial copy number is inferred from the standard curve obtained during the same 
experiment. Triplicate plots were performed for each tumor sample, but the data for only one are shown here. The results are shown in Table II. 



30-fold maximum) (Berns et al., 1992; Borg et al., 1992; Courjal et 
al., 1997). (v) The erbB2 copy numbers obtained with real-time 
PCR were in good agreement with data obtained with other 
quantitative PCR-based assays in terms of the frequency and 
degree of amplification (An et al., 1995; Deng et al., 1996; Valeron 
et al., 1996). Our results also correlate well with those recently 
published by Gelmini et al. (1997), who used the TaqMan system to 
measure erbB2 amplification in a small series of breast tumors 
(n = 25), but with an instrument (LS-50B luminescence spectrom- 
eter, Perkin-Elmer Applied Biosystems) which only allows end- 
TABLE II - EXAMPLES OF ccnd1 GENE DOSAGE RESULTS FROM 3 BREAST TUMORS¹

         ccnd1                            alb
Tumor    Copy number   Mean      SD       Copy number   Mean     SD     Nccnd1/alb
T118     4525                             4223
         4605          4603      77       4365          4325     89     1.06
         4678                             4387
T133     59821                            9787
         61659         61100     1111     10092         10137    375    6.03
         61821                            10533
T145     128563                           7321
         125892        125392    3448     7762          7672     316    16.34
         121722                           7933

¹For each sample, 3 replicate experiments were performed and the mean 
and the standard deviation (SD) were determined. The level of ccnd1 gene 
amplification (Nccnd1/alb) is determined by dividing the average ccnd1 
copy number value by the average alb copy number value. 
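
The Table II figures can be re-derived from the replicate copy numbers. The Python sketch below is illustrative only: the dictionary layout and function names are assumptions, and the use of the sample standard deviation (n - 1 denominator) is an assumption that happens to reproduce the printed SD values.

```python
from statistics import mean, stdev

# Replicate copy numbers as listed in Table II (3 replicates per gene per tumor).
REPLICATES = {
    "T118": {"ccnd1": [4525, 4605, 4678], "alb": [4223, 4365, 4387]},
    "T133": {"ccnd1": [59821, 61659, 61821], "alb": [9787, 10092, 10533]},
    "T145": {"ccnd1": [128563, 125892, 121722], "alb": [7321, 7762, 7933]},
}


def gene_dosage(target, reference):
    """Average target copy number divided by average reference copy number."""
    return mean(target) / mean(reference)


def summarize(replicates):
    """Return mean, sample SD and dosage ratio N for every tumor."""
    summary = {}
    for tumor, genes in replicates.items():
        summary[tumor] = {
            "ccnd1_mean": mean(genes["ccnd1"]),
            "ccnd1_sd": stdev(genes["ccnd1"]),
            "alb_mean": mean(genes["alb"]),
            "alb_sd": stdev(genes["alb"]),
            "N": gene_dosage(genes["ccnd1"], genes["alb"]),
        }
    return summary


SUMMARY = summarize(REPLICATES)
```

Rounded to two decimals, the ratios come out as 1.06, 6.03 and 16.34 for T118, T133 and T145, matching the last column of the table.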



point measurement of fluorescence intensity. Here we report myc 
and ccnd1 gene dosage in breast cancer by means of quantitative 
PCR. (vi) We found a high degree of concordance between 
real-time quantitative PCR and Southern blot analysis in terms of 
gene amplification, especially for samples with high copy numbers 
(> 5-fold). The slightly higher frequency of gene amplification 
(especially ccnd1 and erbB2) observed by means of real-time 
quantitative PCR as compared with Southern-blot analysis may be 
explained by the higher sensitivity of the former method. However, 
we cannot rule out the possibility that some tumors with a few extra 
gene copies observed in real-time PCR had additional copies of an 
arm or a whole chromosome (trisomy, tetrasomy or polysomy) 
rather than true gene amplification. These 2 types of genetic 
alteration (polysomy and gene amplification) could be easily 
distinguished in the future by using an additional probe located on 
the same chromosome arm, but some distance from the target gene. 
It is noteworthy that high gene copy numbers have the greatest 
prognostic significance in breast carcinoma (Borg et al., 1992; 
Slamon et al., 1987). 

Finally, this technique can be applied to the detection of gene 
deletion as well as gene amplification. Indeed, we found a 
decreased copy number of erbB2 (but not of the other 2 proto- 
oncogenes) in several tumors; erbB2 is located in a chromosome 
region (17q21) reported to contain both deletions and amplifica- 
tions in breast cancer (Bieche and Lidereau, 1995). 

In conclusion, gene amplification in various cancers can be used 
as a marker of pre-neoplasia, also for early diagnosis of cancer, 
staging, prognostication and choice of treatment. Southern blotting 
is not sufficiently sensitive, and FISH is lengthy and complex. 
Real-time quantitative PCR overcomes both these limitations, and 
is a sensitive and accurate method of analyzing large numbers of 
samples in a short time. It should find a place in routine clinical 
gene dosage. 
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Genome-wide Study of Gene Copy Numbers, 
Transcripts, and Protein Levels in Pairs of 
Non-invasive and Invasive Human Transitional 
Cell Carcinomas* 

Torben F. Ørntoft, Thomas Thykjaer, Frederic M. Waldman, Hans Wolf, 
and Julio E. Celis 



Gain and loss of chromosomal material is characteristic 
of bladder cancer, as well as malignant transformation in 
general. The consequences of these changes at both the 
transcription and translation levels are at present unknown, 
partly because of technical limitations. Here we have at- 
tempted to address this question in pairs of non-invasive 
and invasive human bladder tumors using a combination 
of technology that included comparative genomic hybrid- 
ization, high density oligonucleotide array-based monitor- 
ing of transcript levels (5600 genes), and high resolution 
two-dimensional gel electrophoresis. The results showed 
that there is a gene dosage effect that in some cases 
superimposes on other regulatory mechanisms. This ef- 
fect depended (p < 0.015) on the magnitude of the com- 
parative genomic hybridization change. In general (18 of 
23 cases), chromosomal areas with more than 2-fold gain 
of DNA showed a corresponding increase in mRNA tran- 
scripts. Areas with loss of DNA, on the other hand, 
showed either reduced or unaltered transcript levels. Be- 
cause most proteins resolved by two-dimensional gels 
are unknown it was only possible to compare mRNA and 
protein alterations in relatively few cases of well focused 
abundant proteins. With few exceptions we found a good 
correlation (p < 0.005) between transcript alterations and 
protein levels. The implications, as well as limitations, 
of the approach are discussed. Molecular & Cellular 
Proteomics 1:37-45, 2002. 

Aneuploidy is a common feature of most human cancers 
(1), but little is known about the genome-wide effect of this 
phenomenon at both the transcription and translation levels. 
High throughput array studies of the breast cancer cell line 
BT474 have suggested that there is a correlation between 
DNA copy numbers and gene expression in highly amplified 
areas (2), and studies of individual genes in solid tumors 
have revealed a good correlation between gene dose and 
mRNA or protein levels in the case of c-erb-B2, cyclin D1, 
ems1, and N-myc (3-5). However, a high cyclin D1 protein 
expression has been observed without simultaneous am- 
plification, and a low level of c-myc copy number in- 
crease was observed without concomitant c-myc protein 
overexpression (6). 

In human bladder tumors, karyotyping, fluorescent in situ 
hybridization, and comparative genomic hybridization (CGH) 
have revealed chromosomal aberrations that seem to be 
characteristic of certain stages of disease progression. In the 
case of non-invasive pTa transitional cell carcinomas (TCCs), 
this includes loss of chromosome 9 or parts of it, as well as 
loss of Y in males. In minimally invasive pT1 TCCs, the fol- 
lowing alterations have been reported: 2q-, 11p-, 1q+, 
11q13+, 17q+, and 20q+ (7-12). It has been suggested that 
these regions harbor tumor suppressor genes and onco- 
genes; however, the large chromosomal areas involved often 
contain many genes, making meaningful predictions of the 
functional consequences of losses and gains very difficult. 

In this investigation we have combined genome-wide tech- 
nology for detecting genomic gains and losses (CGH) with 
gene expression profiling techniques (microarrays and pro- 
teomics) to determine the effect of gene copy number on 
transcript and protein levels in pairs of non-invasive and in- 
vasive human bladder TCCs. 

EXPERIMENTAL PROCEDURES 
Material - Bladder tumor biopsies were sampled after informed 
consent was obtained and after removal of tissue for pathol- 
ogy examination. By light microscopy tumors were 
staged by an experienced pathologist as pTa (superficial papillary), 


1 The abbreviations used are: CGH, comparative genomic hybrid- 
ization; TCC, transitional cell carcinoma; LOH, loss of heterozygosity; 
PA-FABP, psoriasis-associated fatty acid-binding protein; 2D, 
two-dimensional. 
Fig. 1. [...] expression level of specific genes [...] 827 compared with the non-invasive counterpart tumor [...]. The average fluorescent signal ratio [...]. The bold curve in the ratio profile represents a mean [...] surrounded by thin curves indicating one standard deviation. The central vertical line [...]; the vertical lines next to it (dotted) indicate a ratio of 0.5 (left) and 2.0 (right). In chromosomes where the non-invasive tumor showed alterations in DNA content [...]. Each bar represents one gene, identified by the running number above the bars [...]; the bars indicate the purported location of the gene, and the colors indicate the expression level in the invasive tumor compared with the non-invasive counterpart: >2-fold increase (black), >2-fold decrease (blue), no significant change [...]. The bar to the far right, entitled Expression, shows the resulting change in expression along the chromosome [...]. 




grade I and II, respectively; tumors 733 and 827 were staged as pT1 
(invasive into submucosa), 733 was staged as solid, and 827 was 
[...] 

[...] were embedded immediately in a sodium-guanidinium thiocyanate 
solution and stored at -80 °C. Total RNA was isolated using the 
RNAzol B RNA isolation method (WAK-Chemie Medical GMBH). 
Poly(A)+ RNA was isolated by an oligo(dT) selection step (Oligotex [...]). 

[...] of mRNA was used as starting material. 
The first and second strand cDNA synthesis was performed using the 
[...] instructions but using an oligo(dT) primer containing a T7 RNA 
polymerase binding site. Labeled cRNA was prepared using the 
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and 
UTP (Enzo) was used, together with unlabeled NTPs in the reaction. 
Following the in vitro transcription reaction, the unincorporated nu- 
cleotides were removed using RNeasy columns (Qiagen). 

Array Hybridization and Scanning - Array hybridization and scan- 
ning was modified from a previous method (13). [...] of cRNA was 
fragmented at 94 °C for 36 min in buffer containing 40 mM Tris 
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization, 
the fragmented cRNA, in hybridization buffer ([...], 
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min, 
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe 
array cartridge. The probe array was then incubated for 16 h at 40 °C 
at constant rotation (60 rpm). The probe array was exposed to [...] 
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T 
at 50 °C. The biotinylated cRNA was stained with a streptavidin- 
phycoerythrin conjugate, 10 µg/ml (Molecular Probes) in 6x SSPE-T 
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The 
probe arrays were scanned at 560 nm using a confocal laser scanning 
microscope (made for Affymetrix by Hewlett-Packard). The readings 
from the quantitative scanning were analyzed by Affymetrix gene 
expression analysis software. 

Microsatellite Analysis—Microsatellite analysis was performed as 
described previously (14). Microsatellites were selected by use of 
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were ob- 
tained from the genome data base at www.gdb.org. DNA was extracted 
from tumor and blood and amplified by PCR in a volume of 20 µl for 35 
cycles. The amplicons were denatured and electrophoresed for 3 h in an 
ABI Prism 377. Data were collected in the Gene Scan program for 
fragment analysis. Loss of heterozygosity was defined as less than 33% 
of one allele detected in tumor amplicons compared with blood. 
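
One way to read the loss-of-heterozygosity criterion above is sketched below in Python. The interpretation is an assumption: each allele's tumor peak is normalized to the matching blood peak, and LOH is called when the weaker allele retains less than 33% of the stronger one. The function names, argument order and peak values are hypothetical; the source does not describe the exact calculation.

```python
def relative_allele_signal(tumor_a1, tumor_a2, blood_a1, blood_a2):
    """Weaker allele's tumor signal relative to the stronger one.

    Each tumor peak is first divided by the corresponding blood peak so
    that allele-specific amplification differences cancel out.
    """
    if blood_a1 <= 0 or blood_a2 <= 0:
        raise ValueError("blood (normal) peaks must be positive")
    norm_a1 = tumor_a1 / blood_a1
    norm_a2 = tumor_a2 / blood_a2
    strongest = max(norm_a1, norm_a2)
    if strongest <= 0:
        raise ValueError("no tumor signal for either allele")
    return min(norm_a1, norm_a2) / strongest


def has_loh(tumor_a1, tumor_a2, blood_a1, blood_a2, threshold=0.33):
    """Call LOH when the weaker allele falls below the threshold."""
    return relative_allele_signal(tumor_a1, tumor_a2, blood_a1, blood_a2) < threshold
```

With hypothetical peak heights, `has_loh(200, 1000, 1000, 1000)` is true because one allele retains only 20% of its expected signal, whereas `has_loh(900, 1000, 1000, 1000)` is false.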

Proteomic Analysis—TCCs were minced into small pieces and 
homogenized in a small glass homogenizer in 0.5 ml of lysis solution. 
Samples were stored at -20 °C until use. The procedure for 2D gel 
electrophoresis has been described in detail elsewhere (15, 16). Gels 
were stained with silver nitrate and/or Coomassie Brilliant Blue. Pro- 
teins were identified by a combination of procedures that included 
microsequencing, mass spectrometry, two-dimensional gel Western 
immunoblotting, and comparison with the master two-dimensional gel 
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis. 

CGH - Hybridization of differentially labeled tumor and normal DNA 
to normal metaphase chromosomes was performed as described 
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas Red- 
labeled reference DNA (200 ng), and human Cot-1 DNA (20 µg) were 
denatured at 37 °C for 5 min and applied to denatured normal met- 
aphase slides. Hybridization was at 37 °C for 2 days. After washing, 
the slides were counterstained with 0.15 µg/ml 4,6-diamidino-2-phe- 
nylindole in an anti-fade solution. A second hybridization was per- 
formed for all tumor samples using fluorescein-labeled reference DNA 
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the 
aberrations detected during the initial hybridization. Each CGH ex- 
periment also included a normal control hybridization using fluores- 
cein- and Texas Red-labeled normal DNA. Digital image analysis was 
used to identify chromosomal regions with abnormal fluorescence 
ratios, indicating regions of DNA gains and losses. The average 
green:red fluorescence intensity ratio profiles were calculated using 
four images of each chromosome (eight chromosomes total) with 
normalization of the green:red fluorescence intensity ratio for the 
entire metaphase and background correction. Chromosome identifi- 
cation was performed based on 4,6-diamidino-2-phenylindole band- 
ing patterns. Only images showing uniform high intensity fluores- 
cence with minimal background staining were analyzed. All 
centromeres, p arms of acrocentric chromosomes, and heterochro- 
matic regions were excluded from the analysis. 

RESULTS 

Comparative Genomic Hybridization—The CGH analysis 
identified a number of chromosomal gains and losses in the 
Table I 
Correlation between alterations detected by CGH and by expression monitoring 
Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as 
independent variable (if expression alteration - what CGH deviation was found). 

Top 
Tumor          CGH alterations   Expression change clusters                          Concordance 
733 vs. 335    13 Gain           10 Up-regulation, 0 Down-regulation, 3 No change    77% 
733 vs. 335    10 Loss           1 Up-regulation, 5 Down-regulation, 4 No change     50% 
827 vs. 532    10 Gain           8 Up-regulation, 0 Down-regulation, 2 No change     80% 
827 vs. 532    12 Loss           3 Up-regulation, 2 Down-regulation, 7 No change     17% 

Bottom 
Tumor          Expression change clusters   CGH alterations                  Concordance 
733 vs. 335    16 Up-regulation             11 Gain, 2 Loss, 3 No change     69% 
733 vs. 335    21 Down-regulation           1 Gain, 8 Loss, 12 No change     38% 
733 vs. 335    15 No change                 3 Gain, 3 Loss, 9 No change      60% 
827 vs. 532    17 Up-regulation             10 Gain, 5 Loss, 2 No change     59% 
827 vs. 532    9 Down-regulation            0 Gain, 3 Loss, 6 No change      33% 
827 vs. 532    21 No change                 1 Gain, 3 Loss, 17 No change     81% 

two invasive tumors (stage pT1, TCCs 733 and 827), whereas 
the two non-invasive papillomas (stage pTa, TCCs 335 and 
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-, 
and Y-, respectively. Both invasive tumors showed changes 
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-, 
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are typ- 
ical for their disease stage, as well as additional alterations, 
some of which are shown in Fig. 1. Areas with gains and 
losses deviated from the normal copy number to some extent, 
and the average numerical deviation from normal was 0.4-fold 
in the case of TCC 733 and 0.3-fold for TCC 827. The largest 
changes, amounting to at least a doubling of chromosomal 
content, were observed at 1q23 in TCC 733 (Fig. 1A) and 
20q12 in TCC 827 (Fig. 1B). 

mRNA Expression in Relation to DNA Copy Number - The 
mRNA levels from the two invasive tumors (TCCs 827 and 
733) were compared with the two non-invasive counterparts 
(TCCs 532 and 335). This was done in two separate experi- 
ments in which we compared TCCs 733 to 335 and 827 to 
532, respectively, using two different scaling settings for the 
arrays to rule out scaling as a confounding parameter. Ap- 
proximately 1,800 genes that yielded a signal on the arrays 
were searched in the Unigene and Genemap data bases for 
chromosomal location, and those with a known location 
(1096) were plotted as bars covering their purported locus. In 
that way it was possible to construct a graphic presentation of 
DNA copy number and relative mRNA levels along the indi- 
vidual chromosomes (Fig. 1). 

For each mRNA a ratio was calculated between the level in 
the invasive versus the non-invasive counterpart. Bars, which 
represent chromosomal location of a gene, were color-coded 
according to the expression ratio, and only differences larger 
than 2-fold were regarded as informative (Fig. 1). The density 
of genes along the chromosomes varied, and areas contain- 
ing only one gene were excluded from the calculations. The 
resolution of the CGH method is very low, and some of the 
outlier data may be because of the fact that the boundaries of 
the chromosomal aberrations are not known at high resolution. 
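
The 2-fold rule described above, together with the region-level rule stated in the legend to Fig. 2 (at least half of the mRNAs in a region must change in the same direction), can be sketched in Python as follows. The function names, the tie-breaking order and the example ratios are assumptions for illustration; the source does not give an implementation.

```python
def classify_ratio(ratio, cutoff=2.0):
    """Label an invasive/non-invasive expression ratio.

    Only differences larger than the cutoff count, so a ratio of exactly
    0.5 (the expected effect of losing one of two alleles) is "no change".
    """
    if ratio > cutoff:
        return "up"
    if ratio < 1.0 / cutoff:
        return "down"
    return "no change"


def score_region(ratios, cutoff=2.0):
    """Score a chromosomal area from the ratios of the genes mapped to it.

    Areas with a single gene are excluded (returns None). A direction is
    scored when at least half of the genes show it; "up" is checked first,
    which is an arbitrary tie-break.
    """
    if len(ratios) < 2:
        return None
    calls = [classify_ratio(r, cutoff) for r in ratios]
    for direction in ("up", "down"):
        if calls.count(direction) * 2 >= len(calls):
            return direction
    return "no change"
```

For instance, hypothetical ratios of 3.0, 2.5 and 1.1 in one area would be scored as "up", while a single-gene area returns `None` and is left out of the calculations.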

Two sets of calculations were made from the data. For the 
first set we used CGH alterations as the independent variable 
and estimated the frequency of expression alterations in these 
chromosomal areas. In general, areas with a strong gain of 
chromosomal material contained a cluster of genes having 
increased mRNA expression. For example, chromo- 
somes 1q21-q25, 2p, and 9q showed a relative gain of more 
than 100% in DNA copy number that was accompanied by 
increased mRNA expression levels in the two tumor pairs (Fig. 
1). In most cases, chromosomal gains detected by CGH were 
accompanied by an increased level of transcripts in both 
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal 
losses, on the other hand, were not accompanied by de- 
creased expression in several cases, and were often regis- 
tered as having unaltered RNA levels (Table I, top). The inabil- 
ity to detect RNA expression changes in these cases was not 
because of fewer genes mapping to the lost regions (data not 
shown). 
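
The concordance percentages quoted here follow directly from the counts in the top half of Table I. A minimal Python sketch is given below; the data layout and function name are illustrative assumptions, and "concordant" is taken to mean up-regulation in gained areas and down-regulation in lost areas.

```python
# Counts from Table I (top): expression change clusters per CGH alteration.
TABLE_I_TOP = {
    "733 vs. 335": {
        "gain": {"up": 10, "down": 0, "no change": 3},
        "loss": {"up": 1, "down": 5, "no change": 4},
    },
    "827 vs. 532": {
        "gain": {"up": 8, "down": 0, "no change": 2},
        "loss": {"up": 3, "down": 2, "no change": 7},
    },
}

# Expression direction that agrees with each kind of CGH alteration.
EXPECTED_DIRECTION = {"gain": "up", "loss": "down"}


def concordance(counts, expected):
    """Percentage of altered areas whose expression changed as expected."""
    total = sum(counts.values())
    return 100.0 * counts[expected] / total


CONCORDANCE = {
    pair: {
        alteration: concordance(counts, EXPECTED_DIRECTION[alteration])
        for alteration, counts in alterations.items()
    }
    for pair, alterations in TABLE_I_TOP.items()
}
```

Summing the two gain rows gives 18 concordant areas out of 23, the figure cited in the abstract.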

In the second set of calculations we selected expression 
alterations above 2-fold as the independent variable and es- 
timated the frequency of CGH alterations in these areas. As 
above, we found that increased transcript expression corre- 
lated with gain of chromosomal material (TCC 733, 69% and 
TCC 827, 59%), whereas reduced expression was often de- 
tected in areas with unaltered CGH ratios (Table I, bottom). 
Furthermore, as a control we looked at areas with no alter- 
[Figure 2, panels A (Tumor 827 versus 532) and B (Tumor 733 versus 335).] 

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array 
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (A) and 733 (B) and their non-invasive 
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting 
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated to be 
scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the 
ratio value of one were included. 



ation in expression. No alteration was detected by CGH in 
most of these areas (TCC 733, 60% and TCC 827, 81%; see 
Table I, bottom). Because the ability to observe reduced or 
increased mRNA expression clustering to a certain chromo- 
somal area clearly reflected the extent of copy number 
changes, we plotted the maximum CGH aberrations in the 
regions showing CGH changes against the ability to detect a 
change in mRNA expression as monitored by the oligonucleo- 
tide arrays (Fig. 2). For both tumors, TCC 733 (p < 0.015) and 
TCC 827 (p < 0.00003), a highly significant correlation was 
observed between the level of CGH ratio change (reflecting 
the DNA copy number) and alterations detected by the array 
based technology (Fig. 2). Similar data were obtained when 
areas with altered expression were used as independent vari- 
ables. These areas correlated best with CGH when the CGH 
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did 
not at lower CGH deviations. These data probably reflect that 
loss of an allele may only lead to a 50% reduction in expres- 
sion level, which is at the cut-off point for detection of expres- 
sion alterations. Gain of chromosomal material can occur to a 
much larger extent. 

Microsatellite-based Detection of Minor Areas of Losses - 
In TCC 733, several chromosomal areas exhibiting DNA 
amplification were preceded or followed by areas with a nor- 
mal CGH but reduced mRNA expression (see Fig. 1, TCC 733 
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and 
10q22). To determine whether these results were because of 
undetected loss of chromosomal material in these regions or 
because of other non-structural mechanisms regulating tran- 
scription, we examined two microsatellites positioned at chro- 
mosome 1q25-32 and two at chromosome 2p22. Loss of 
heterozygosity (LOH) was found at both 1q25 and at 2p22, 
indicating that minor deleted areas were not detected with the 
resolution of CGH (Fig. 3). Additionally, chromosome 2p in 
TCC 733 showed a CGH pattern of gain/no change/gain of 
DNA that correlated with transcript increase/decrease/in- 
crease. Thus, for the areas showing increased expression 
there was a correlation with the DNA copy number alterations 
(Fig. 1A). As indicated above, the mRNA decrease observed in 
the middle of the chromosomal gain was because of LOH, 
implying that one of the mechanisms for mRNA down-regu- 
lation may be regions that have undergone smaller losses of 
chromosomal material. However, this cannot be detected with 
the resolution of the CGH method. 

In both TCC 733 and TCC 827, the telomeric end of chro- 
mosome 11p showed a normal ratio in the CGH analysis; 
however, clusters of five and three genes, respectively, lost 
their expression. Two microsatellites (D11S1760, D11S922) 
positioned close to MUC2, IGF2, and cathepsin D indicated 
LOH as the most likely mechanism behind the loss of expres- 
sion (data not shown). 

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24 
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2, 
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 
Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor 
733 showing loss of heterozygosity at chromosome 1q25, detected 
(a) by D1S215 close to Hu class I histocompatibility antigen (gene 
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene 
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close 
to general β-spectrin (gene number 11 on Fig. 1) and of (d) tumor 827 
showing loss of heterozygosity at chromosome 18q12 by D18S1118 
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number 
12 in Fig. 1). The upper curves show the electropherogram obtained 
from normal DNA from leukocytes (N), and the lower curves show the 
electropherogram from tumor DNA (T). In all cases one allele is 
partially lost in the tumor amplicon. 

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that 
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Relation between Changes in mRNA and Protein Levels-- 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 



[Figure 4: mRNA ratio (vertical axis) plotted against protein level on 2D gel (reduced, unaltered, or increased protein; horizontal axis).] 

Fig. 4. Correlation between protein levels as judged by 2D- 
PAGE and transcript ratio. For comparison proteins were divided in 
three groups, unaltered in level or up- or down-regulated (horizontal 
axis). The mRNA ratio as determined by oligonucleotide arrays was 
plotted for each gene (vertical axis). ▲, mRNAs that were scored as 
present in both tumors used for the ratio calculation; △, mRNAs that 
were scored as absent in the invasive tumors (along horizontal axis) or 
as absent in non-invasive reference (top of figure). Two different 
scalings were used to exclude scaling as a confounder. TCCs 827 
and 532 (▲△) were scaled with background suppression, and TCCs 
733 and 335 (●○) were scaled without suppression. Both compari- 
sons showed highly significant (p < 0.005) differences in mRNA ratios 
between the groups. Proteins shown were as follows: Group A (from 
left), phosphoglucomutase 1, glutathione transferase class µ number 
4, fatty acid-binding protein homologue, cytokeratin 15, and cyto- 
keratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa 
heat shock protein, cytokeratin 13, and calcyclin; C (from left), α-eno- 
lase, hnRNP B1, 28-kDa heat shock protein, 14-3-3-6, and 
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from 
top), glutathione S-transferase-π and mesothelial keratin K7 (type II); 
F (from top and left), adenylyl cyclase-associated protein, E-cadherin, 
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV, cy- 
toskeletal γ-actin, hnRNP A1, integral membrane protein calnexin 
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa 
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B, 
translationally controlled tumor protein, liver glyceraldehyde-3-phos- 
phate dehydrogenase, keratin 8, aldehyde reductase, and Na,K- 
ATPase β-1 subunit; G (from top and left), TCP20, calgizzarin, 70- 
kDa heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP 
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver glyc- 
eraldehyde-3-phosphate dehydrogenase, glutathione S-transfer- 
ase-π, and keratin 8; H (from left), plasma gelsolin, autoantigen cal- 
reticulin, thioredoxin, and NAD+-dependent 15 hydroxyprostaglandin 
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit, cyto- 
keratin 20, cytokeratin 17, prohibitin, and fructose 1,6-biphos- 
phatase; J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat 
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78, 
cyclophilin, and cofilin. 

gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant corre- 
lation (p < 0.006) between mRNA and protein alterations (Fig. 
4). Only one gene showed disagreement between transcript 
alteration and protein alteration. Except for a group of cyto- 
Fig. 5. Comparison of protein and transcript levels in invasive 
and non-invasive TCCs. The upper part of the figure shows a 2D gel 
(left) and the oligonucleotide array (right) of TCC 532. The red rectan- 
gles on the upper gel highlight the areas that are compared below. 
Identical areas of 2D gels of TCCs 532 and 827 are shown below. 
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC 
827 (red annotation). The tile on the array containing probes for 
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532 
and is compared with TCC 827. The upper row of squares in each tile 
corresponds to perfect match probes; the lower row corresponds to 
mismatch probes containing a mutation (used for correction for un- 
specific binding). Absence of signal is depicted as black, and the 
higher the signal the lighter the color. A high transcript level was 
detected in TCC 532 (6151 units) whereas a much lower level was 
detected in TCC 827 (absence of signals). For cytokeratin 13, a high 
transcript level was also present in TCC 532 (15659 units), and a 
much lower level was present in TCC 827 (623 units). The 2D gels at 
the bottom of the figure (left) show levels of PA-FABP and adipocyte- 
FABP in TCCs 335 and 733 (invasive), respectively. Both proteins are 
down-regulated in the invasive tumor. To the right we show the array 
tiles for the PA-FABP transcript. A medium transcript level was de- 
tected in the case of TCC 335 (1277 units) whereas very low levels 
were detected in TCC 733 (166 units). IEF, isoelectric focusing. 



keratins encoded by genes on chromosome 17 (Fig. 5), the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal 
location were detected in TCCs 733 and 335, and of these 19 
correlated (p < 0.005) with the mRNA changes detected using 
the arrays (Fig. 4). For example, PA-FABP was highly ex- 
pressed in the non-invasive TCC 335 but lost in the invasive 
counterpart (TCC 733; see Fig. 5). The smaller number of 
proteins detected in both 733 and 335 was because of the 
smaller size of the biopsies that were available. 

11 chromosomal regions where CGH showed aberrations 
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these pro- 
teins were encoded by genes in chromosome 17q, a fre- 
quently amplified chromosomal area in invasive bladder 
cancers. 

DISCUSSION 

Most human cancers have abnormal DNA content, having 
lost some chromosomal parts and gained others. The present 
study provides some evidence as to the effect of these gains 
and losses on gene expression in two pairs of non-invasive 
and invasive TCCs using high throughput expression arrays 
and proteomics, in combination with CGH. In general, the 
results showed that there is a clear individual regulation of the 
mRNA expression of single genes, which in some cases was 
superimposed by a DNA copy number effect. In most cases, 
genes located in chromosomal areas with gains often exhibited 
increased mRNA expression, whereas areas showing 
losses showed either no change or a reduced mRNA expression. 
The latter might be because of the fact that losses most 
often are restricted to loss of one allele, and the cut-off point 
for detection of expression alterations was a 2-fold change, 
thus being at the border of detection. In several cases, however, 
an increase or decrease in DNA copy number was 
associated with de novo occurrence or complete loss of transcript, 
respectively. Some of these transcripts could not be 
detected in the non-invasive tumor but were present at relatively 
high levels in areas with DNA amplifications in the invasive 
tumors (e.g. in TCC 733 transcript from cellular ligand of 
annexin II gene (chromosome 1q21) from absent to 2670 
arbitrary units; in TCC 827 transcript from small proline-rich 
protein 1 gene (chromosome 1q12-q21.1) from absent to 
1326 arbitrary units). It may be anticipated from these data 
that significant clustering of genes with an increased expression 
to a certain chromosomal area indicates an increased 
likelihood of gain of chromosomal material in this area. 

Table II 
Proteins whose expression level correlates with both mRNA and gene dose changes 

Protein                Chromosomal location  Tumor TCC  CGH alteration  Transcript alteration  Protein alteration 
Annexin II             1q21                  733        Gain            Abs to Pres            Increase 
Annexin IV             2p13                  733        Gain            3.9-Fold up            Increase 
Cytokeratin 17         17q12-q21             827        Gain            3.?-Fold up            Increase 
Cytokeratin 20         17q21.1               827        Gain            5.6-Fold up            Increase 
(PA-)FABP              8q21.2                827        Loss            10-Fold down           Decrease 
FBP1                   9q22                  827        Gain            2.3-Fold up            Increase 
Plasma gelsolin        9q31                  827        Gain            Abs to Pres            Increase 
Heat shock protein 28  15q12-q13             827        Loss            2.5-Fold up            Decrease 
Prohibitin             17q21                 827/733    Gain            3.7-/2-Fold up         Increase 
Prolyl-4-hydroxyl      17q25                 827/733    Gain            5.7/1.6-Fold up        Increase 
hnRNP B1               7p15                  827        Loss            2.5-Fold down          Decrease 

In cases where alterations were found in both TCCs 827 and 733 these are shown as 827/733. 

Considering the many possible regulatory mechanisms acting 
at the level of transcription, it seems striking that the gene 
dose effects were so clearly detectable in gained areas. One 
hypothetical explanation may lie in the loss of controlled 
methylation in tumor cells (17-19). Thus, it may be possible 
that in chromosomes with increased DNA copy numbers two 
or more alleles could be demethylated simultaneously leading 
to a higher transcription level, whereas in chromosomes with 
losses the remaining allele could be partly methylated, turning 
off the process (20, 21). A recent report has documented a 
ploidy regulation of gene expression in yeast, but in this case all 
the genes were present in the same ratio (22), a situation that is 
not analogous to that of cancer cells, which show marked 
chromosomal aberrations, as well as gene dosage effects. 

Several CGH studies of bladder cancer have shown that 
some chromosomal aberrations are common at certain 
stages of disease progression, often occurring in more than 1 
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y- 
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+, 
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here 
showed similar aberrations such as 9p- and 9q22-q33- and 
9q- and Y-, respectively. Likewise, the two minimally invasive 
pT1 tumors showed aberrations that are commonly seen at 
that stage, and TCC 827 had a remarkable resemblance to the 
commonly seen pattern of losses and gains, such as 1q22-24 
amplification (seen in both tumors), 11q14-q22 loss, the latter 
often linked to 17q+ (both tumors), and 1q+ and 9p-, often 
linked to 20q+ and 11q13+ (both tumors) (7-9). These observations 
indicate that the pairs of tumors used in this study 
exhibit chromosomal changes observed in many tumors, and 
therefore the findings could be of general importance for 
bladder cancer. 

Considering that the mapping resolution of CGH is about 
20 megabases, it is only possible to get a crude picture of 
chromosomal instability using this technique. Occasionally, 
we observed reduced transcript levels close to or inside regions 
with increased copy numbers. Analysis of these regions 
by positioning heterozygous microsatellites as close as possible 
to the locus showing reduced gene expression revealed 
loss of heterozygosity in several cases. It seems likely that 
multiple and different events occur along each chromosomal 
arm and that the use of cDNA microarrays for analysis of DNA 
copy number changes will reach a resolution that can resolve 
these changes, as has recently been proposed (2). The outlier 
data were not more frequent at the boundaries of the CGH 
aberrations. At present we do not know the mechanism behind 
chromosomal aneuploidy and cannot predict whether 
chromosomal gains will be transcribed to a larger extent than 
the two native alleles. A mechanism such as genetic imprinting has 
an impact on the expression level in normal cells and is often 
reduced in tumors. However, the relation between imprinting 
and gain of chromosomal material is not known. 

We regard it as a strength of this investigation that we were 
able to compare invasive tumors to benign tumors rather than 
to normal urothelium, as the tumors studied were biologically 
very close and probably represent successive steps in 
the progression of bladder cancer. Despite the limited amount 
of fresh tissue available it was possible to apply three different 
state of the art methods. The observed correlation between 
DNA copy number and mRNA expression is remarkable when 
one considers that different pieces of the tumor biopsies were 
used for the different sets of experiments. This indicates that 
bladder tumors are relatively homogeneous, a notion recently 
supported by CGH and LOH data that showed a remarkable 
similarity even between tumors and distant metastasis (10, 23). 

In the few cases analyzed, mRNA and protein levels 
showed a striking correspondence although in some cases 
we found discrepancies that may be attributed to translational 
regulation, post-translational processing, protein degradation 
or a combination of these. Some transcripts belong to 
undertranslated mRNA pools, which are associated with few 
translationally inactive ribosomes; these pools, however, 
seem to be rare (24). Protein degradation, for example, may 
be very important in the case of polypeptides with a short 
half-life (e.g. signaling proteins). A poor correlation between 
mRNA and protein levels was found in liver cells as determined 
by arrays and 2D-PAGE (25), and a moderate correlation 
was recently reported by Ideker et al. (26) in yeast. 
Interestingly, our study revealed a much better correlation 
between gained chromosomal areas and increased mRNA 
levels than between loss of chromosomal areas and reduced 
mRNA levels. In general, the level of CGH change determined 
the ability to detect a change in transcript. One possible 
explanation could be that by losing one allele the change in 
mRNA level is not so dramatic as compared with gain of 
material, which can be rather unlimited and may lead to a 
severalfold increase in gene copy number resulting in a much 
higher impact on transcript level. The latter would be much 
easier to detect on the expression arrays as the cut-off point 
was placed at a 2-fold level so as not to be biased by noise on 
the array. Construction of arrays with a better signal to noise 
ratio may in the future allow detection of lesser than 2-fold 
alterations in transcript levels, a feature that may facilitate the 
analysis of the effect of loss of chromosomal areas on transcript 
levels. 

In eleven cases we found a significant correlation between 
DNA copy number, mRNA expression, and protein level. Four 
of these proteins were encoded by genes located at a frequently 
amplified area in chromosome 17q. Whether DNA 
copy number is one of the mechanisms behind alteration of 
these eleven proteins is at present unknown and will have to 
be proved by other methods using a larger number of samples. 
One factor making such studies complicated is the large 
extent of protein modification that occurs after translation, 
requiring immunoidentification and/or mass spectrometry to 
correctly identify the proteins in the gels. 

In conclusion, the results presented in this study exemplify 
the large body of knowledge that may be possible to gather in 
the future by combining state of the art techniques that follow 
the pathway from DNA to protein (26). Here, we used a traditional 
chromosomal CGH method, but in the future high resolution 
CGH based on microarrays with many thousand radiation 
hybrid-mapped genes will increase the resolution and information 
derived from these types of experiments (2). Combined with 
expression arrays analyzing transcripts derived from genes with 
known locations, and 2D gel analysis to obtain information at 
the post-translational level, a clearer and more developed understanding 
of the tumor genome will be forthcoming. 

ABSTRACT 

Genetic changes underlie tumor progression and may lead to cancer- 
specific expression of critical genes. Over 1100 publications have described 
the use of comparative genomic hybridization (CGH) to analyze 
the pattern of copy number alterations in cancer, but very few of the genes 
affected are known. Here, we performed high-resolution CGH analysis on 
cDNA microarrays in breast cancer and directly compared copy number 
and mRNA expression levels of 13,824 genes to quantitate the impact of 
genomic changes on gene expression. We identified and mapped the 
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12 
Mb. Throughout the genome, both high- and low-level copy number 
changes had a substantial impact on gene expression, with 44% of the 
highly amplified genes showing overexpression and 10.5% of the highly 
overexpressed genes being amplified. Statistical analysis with random 
permutation tests identified 270 genes whose expression levels across 14 
samples were systematically attributable to gene amplification. These 
included most previously described amplified genes in breast cancer and 
many novel targets for genomic alterations, including the HOXB7 gene, 
the presence of which in a novel amplicon at 17q21.3 was validated in 
10.2% of primary breast cancers and associated with poor patient prognosis. 
In conclusion, CGH on cDNA microarrays revealed hundreds of 
novel genes whose overexpression is attributable to gene amplification. 
These genes may provide insights to the clonal evolution and progression 
of breast cancer and highlight promising therapeutic targets. 

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct categories, 
some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited. 

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against ampli- 
fied oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over 
20 recurrent regions of DNA amplification have been mapped in 
breast cancer by CGH* (9, 10). However, these amplicons are often 
large and poorly defined, and their impact on gene expression remains 
unknown. 

Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of 
over- and underexpressed genes (Y axis) according to copy number ratio (X axis). 
Threshold values used for over- and underexpression were >2.184 (global upper 7% of 
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage 
of amplified and deleted genes according to expression ratios. Threshold values for 
amplification and deletion were >1.5 and <0.7. 

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively involved 
in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and 
CGH microarrays to: (a) determine the global impact that gene copy 
number variation plays in breast cancer development and progression; 
and (b) identify and characterize those genes whose mRNA expression 
is most significantly associated with amplification of the corresponding 
genomic template. 

* The abbreviations used are: CGH, comparative genomic hybridization; FISH, fluorescence 
in situ hybridization; RT-PCR, reverse transcription-PCR. 

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20, BT-474, 
HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468, 
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the 
American Type Culture Collection (Manassas, VA). Cells were grown under 
recommended culture conditions. Genomic DNA and mRNA were isolated 
using standard protocols. 

Copy Number and Expression Analyses by cDNA Microarrays. The 
preparation and printing of the 13,824 cDNA clones on glass slides were 
performed as described (11-13). Of these clones, 244 represented uncharacterized 
expressed sequence tags, and the remainder corresponded to known 
genes. CGH experiments on cDNA microarrays were done as described (14, 
15). Briefly, 20 µg of genomic DNA from breast cancer cell lines and normal 
human WBCs were digested for 14-18 h with AluI and RsaI (Life Technologies, 
Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six 
µg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham 
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using 
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and 
posthybridization washes (13) were done as described. For the expression 
analyses, a standard reference (Universal Human Reference RNA; Stratagene, 
La Jolla, CA) was used in all experiments. Forty µg of reference RNA were 
labeled with Cy3-dUTP and 3.5 µg of test mRNA with Cy5-dUTP, and the 
labeled cDNAs were hybridized on microarrays as described (13, 15). For both 
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo 
Alto, CA) was used to measure the fluorescence intensities at the target 
locations using the DEARRAY software (16). After background subtraction, 
average intensities at each clone in the test hybridization were divided by the 
average intensity of the corresponding clone in the control hybridization. For 
the copy number analysis, the ratios were normalized on the basis of the 
distribution of ratios of all targets on the array and for the expression analysis 
on the basis of 88 housekeeping genes, which were spotted four times onto the 
array. Low quality measurements (i.e., copy number data with mean reference 
intensity <100 fluorescent units, and expression data with both test and 
reference intensity <100 fluorescent units and/or with spot size <50 units) 
were excluded from the analysis and were treated as missing values. The 
distributions of fluorescence ratios were used to define cutpoints for increased/ 
decreased copy number. Genes with CGH ratio >1.43 (representing the upper 
5% of the CGH ratios across all experiments) were considered to be amplified, 
and genes with ratio <0.73 (representing the lower 5%) were considered to be 
deleted. 
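
The cutpoint rule described above can be sketched in a few lines. This is a minimal illustration and not the authors' software: the function names `percentile`, `copy_number_cutpoints`, and `classify_clone` are hypothetical, and the linear-interpolation percentile is an assumption, since the text does not say how the upper and lower 5% of ratios were computed.

```python
import math

def percentile(values, pct):
    """Linear-interpolation percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def copy_number_cutpoints(ratios, tail_pct=5.0):
    """Return (deletion_cutpoint, amplification_cutpoint) from pooled CGH ratios.

    Low-quality spots are assumed to be encoded as None and are skipped,
    mirroring the "treated as missing values" rule in the text.
    """
    observed = [r for r in ratios if r is not None]
    return percentile(observed, tail_pct), percentile(observed, 100.0 - tail_pct)

def classify_clone(ratio, deleted_below=0.73, amplified_above=1.43):
    """Label one clone using the cutpoints reported in the text (0.73 and 1.43)."""
    if ratio is None:
        return "missing"
    if ratio > amplified_above:
        return "amplified"
    if ratio < deleted_below:
        return "deleted"
    return "unchanged"
```

With the reported cutpoints, a clone with ratio 1.5 would be labeled amplified and one with ratio 0.70 deleted; on a different data set the cutpoints would first be re-derived with `copy_number_cutpoints`.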

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate 
the influence of copy number alterations on gene expression, we applied the 
following statistical approach. CGH and cDNA calibrated intensity ratios were 
log-transformed and normalized using median centering of the values in each 
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were 
median centered. For each gene, the CGH data were represented by a vector 
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification. 
Amplification was correlated with gene expression using the signal-to-noise 
statistics (1). We calculated a weight, $w_g$, for each gene as follows: 

$$w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}}$$ 

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression 
levels for amplified and nonamplified cell lines, respectively. To assess the 
statistical significance of each weight, we performed 10,000 random permutations 
of the label vector. The probability that a gene had a larger or equal 
weight by random permutation than the original weight was denoted by α. A 
low α (<0.05) indicates a strong association between gene expression and 
amplification. 
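
The weight and its permutation test can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the names `signal_to_noise` and `permutation_alpha` are hypothetical, the sample standard deviation is assumed (the text says only "SDs"), and requiring at least two cell lines per group is an assumption made so that both SDs are defined.

```python
import random
import statistics

def signal_to_noise(expression, labels):
    """Weight = (mean_amp - mean_non) / (sd_amp + sd_non) for one gene.

    expression: per-cell-line expression values (log ratios).
    labels: 1 for amplified cell lines, 0 for nonamplified ones.
    """
    amp = [x for x, lab in zip(expression, labels) if lab == 1]
    non = [x for x, lab in zip(expression, labels) if lab == 0]
    if len(amp) < 2 or len(non) < 2:
        raise ValueError("need at least two cell lines per group (assumption)")
    spread = statistics.stdev(amp) + statistics.stdev(non)
    if spread == 0:
        raise ValueError("zero spread: weight is undefined")
    return (statistics.mean(amp) - statistics.mean(non)) / spread

def permutation_alpha(expression, labels, n_permutations=10000, seed=0):
    """Fraction of label shuffles whose weight is >= the observed weight."""
    observed = signal_to_noise(expression, labels)
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if signal_to_noise(expression, shuffled) >= observed:
            hits += 1
    return hits / n_permutations
```

A gene would then be flagged when `permutation_alpha(...)` falls below 0.05, matching the threshold stated in the text.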

Genomic Localization of cDNA Clones and Amplicon Mapping. Each 
cDNA clone on the microarray was assigned to a Unigene cluster using the 
Unigene Build 141.* A database of genomic sequence alignment information 
for mRNA sequences was created from the August 2001 freeze of the University 
of California Santa Cruz's GoldenPath database.' The chromosome and 
bp positions for each cDNA clone were then retrieved by relating these data 
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two 
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three 
adjacent clones in a single cell line. The amplicon start and end positions were 


* Internet address: http://research.nhgri.nih.gov/microarray/downloadable [remainder illegible]. 
' Internet address: www.genome.ucsc.edu. 

Table I Summary of independent amplicons in 14 breast cancer cell lines by 
CGH microarray 

Location             Start (Mb)  End (Mb)  Size (Mb) 
1p13                 132.79      132.94    0.2 
1q21                 173.92      177.25    3.3 
1q22                 179.28      179.57    0.3 
3p14                 71.94       74.66     2.7 
7p12.1-7p11.2        55.62       60.95     5.3 
7q31                 125.73      130.96    5.2 
7q32                 140.01      140.68    0.7 
8q21.11-8q21.13      86.45       92.46     6.0 
8q21.3               98.45       103.05    4.6 
8q23.3-8q24.14       129.88      142.15    12.3 
8q24.22              151.21      152.16    1.0 
9p13                 38.65       39.25     0.6 
13q22-q31            77.15       81.38     4.2 
16q22                86.70       87.62     0.9 
17q11                29.30       30.85     1.6 
17q12-q21.2          39.79       42.80     3.0 
17q21.32-q21.33      52.47       55.80     3.3 
17q22-q23.3          63.81       69.70     5.9 
17q23.3-q24.3        69.93       74.99     5.1 
19q13                40.63       41.40     0.8 
20q11.22             34.59       35.85     1.3 
20q13.12             44.00       45.62     1.6 
20q13.12-q13.13      46.45       49.43     3.0 
20q13.2-q13.32       51.32       59.12     7.8 



CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1, 
and 20q13.2 regions being most commonly amplified. Furthermore, 
the boundaries of these amplicons were precisely delineated. In addition, 
novel amplicons were identified at 9p13 (38.65-39.25 Mb), 
and 17q21.3 (52.47-55.80 Mb). 

Direct Identification of Putative Amplification Target Genes. 
The cDNA/CGH microarray technique enables the direct correlation 
of copy number and expression data on a gene-by-gene basis 
throughout the genome. We directly annotated high-resolution 
CGH plots with gene expression data using color coding. Fig. 2C 
shows that most of the amplified genes in the MCF-7 breast cancer 
cell line at 1p13, 17q22-q23, and 20q13 were highly overexpressed. 
A view of chromosome 7 in the MDA-468 cell line 
implicates EGFR as the most highly overexpressed and amplified 
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons 
at 17q12 and 17q22-q23 contained numerous highly overexpressed 
genes (Fig. 3B). In addition, several genes, including the 
homeobox genes HOXB2 and HOXB7, were highly amplified in a 
previously undescribed independent amplicon at 17q21.3. HOXB7 
was systematically amplified (as validated by FISH, Fig. 3B, inset) 
as well as overexpressed (as verified by RT-PCR, data not shown) 
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel 



extended to include neighboring nonamplified clones (ratio, <1.5). The amplicon 
size determination was partially dependent on local clone density. 
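
The amplicon definition in this section (ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched as below. This is only one reading of the rule and not the authors' implementation: the function name `call_amplicons` is hypothetical, "two adjacent clones in two or more cell lines" is assumed to mean the same adjacent pair in each line, clones are assumed to be pre-sorted by genomic position, and the extension of boundaries to neighboring nonamplified clones is omitted.

```python
def call_amplicons(ratios_by_line, threshold=2.0):
    """Return (first_index, last_index) clone intervals called as amplicons.

    ratios_by_line: dict mapping cell line name -> list of CGH ratios, one per
    clone, all lists ordered identically along the chromosome (None = missing).
    """
    n = len(next(iter(ratios_by_line.values())))
    high = [
        [r is not None and r > threshold for r in ratios]
        for ratios in ratios_by_line.values()
    ]
    seeded = [False] * n

    # Rule A: a run of three or more adjacent high clones in a single line.
    for flags in high:
        i = 0
        while i < n:
            if not flags[i]:
                i += 1
                continue
            j = i
            while j < n and flags[j]:
                j += 1
            if j - i >= 3:
                for k in range(i, j):
                    seeded[k] = True
            i = j

    # Rule B: the same adjacent pair is high in two or more lines.
    for i in range(n - 1):
        support = sum(1 for flags in high if flags[i] and flags[i + 1])
        if support >= 2:
            seeded[i] = seeded[i + 1] = True

    # Merge consecutive seeded clones into intervals.
    amplicons, i = [], 0
    while i < n:
        if seeded[i]:
            j = i
            while j + 1 < n and seeded[j + 1]:
                j += 1
            amplicons.append((i, j))
            i = j + 1
        else:
            i += 1
    return amplicons
```

Converting the returned clone indices to Mb positions would use the clone-to-genome mapping described above; as the text notes, the resulting sizes depend on local clone density.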

FISH. Dual-color interphase FISH to breast cancer cell lines was done as 
described (17). Bacterial artificial chromosome clone RP11-361K8 was labeled 
with SpectrumOrange (Vysis, Downers Grove, IL), and SpectrumOrange-labeled 
probe for EGFR was obtained from Vysis. SpectrumGreen-labeled 
chromosome 7 and 17 centromere probes (Vysis) were used as a 
reference. A tissue microarray containing 612 formalin-fixed, paraffin-embedded 
primary breast cancers (17) was applied in FISH analyses as described 
(18). The use of these specimens was approved by the Ethics Committee of the 
University of Basel and by the NIH. Specimens containing a 2-fold or higher 
increase in the number of test probe signals, as compared with corresponding 
centromere signals, in at least 10% of the tumor cells were considered to be 
amplified. Survival analysis was performed using the Kaplan-Meier method 
and the log-rank test. 
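
The FISH scoring criterion stated above can be written out as a short check. This is a hedged sketch, not a description of how scoring was actually recorded: the function name `specimen_is_amplified` and the per-cell `(test_signals, centromere_signals)` input format are assumptions, as is skipping cells with no centromere signal.

```python
def specimen_is_amplified(cells, min_fold=2.0, min_fraction=0.10):
    """Apply the stated rule: test probe signals >= 2x centromere signals
    in at least 10% of the scored tumor cells.

    cells: iterable of (test_signals, centromere_signals) counts per cell.
    """
    scored = [(t, c) for t, c in cells if c > 0]  # unscorable cells skipped
    if not scored:
        raise ValueError("no scorable cells")
    n_amplified = sum(1 for t, c in scored if t >= min_fold * c)
    return n_amplified / len(scored) >= min_fraction
```

For example, one cell with 4 test and 2 centromere signals among ten scored cells meets the 10% criterion, whereas the same single cell among twenty does not.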

RT-PCR. The HOXB7 expression level was determined relative to 
GAPDH. Reverse transcription and PCR amplification were performed using 
Access RT-PCR System (Promega Corp., Madison. WI) with 10 «g of 
as a template. H0XB7 primers were 5*-GAGCAGAGGGACTCGGACrrT-3 
and 5'-GCGTCAGGTAGCGArrGTAG-3'. 

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824 
arrayed cDNA clones were applied for analysis of gene expression 
and gene copy number (CGH microarrays) in 14 breast cancer cell 
lines. The results illustrate a considerable influence of copy number 
on gene expression patterns. Up to 44% of the highly amplified 
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to 
the global upper 7% of expression ratios), compared with only 6% for 
genes with normal copy number levels (Fig. 1A). Conversely, 10.5% 
of the transcripts with high-level expression (cDNA ratio, >10) 
showed increased copy number (Fig. 1B). Low-level copy number 
increases and decreases were also associated with similar, although 
less dramatic, outcomes on gene expression (Fig. 1). 

Identification of Distinct Breast Cancer Amplicons. Base-pair 
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy 
number changes as a function of genomic position (Fig. 2, Supplement 
Fig. A). The average spacing of clones throughout the genome 
was 267 kb. This high-resolution mapping identified 24 independent 
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table 
I). Several amplification sites detected previously by chromosomal 

amplification was validated to be present in 10.2% of 363 primary 
breast cancers by FISH to a tissue microarray and was associated 
with poor prognosis of the patients (P = 0.001). 

Statistical Identification and Characterization of 270 Highly 
Expressed Genes in Amplicons. Statistical comparison of expres- 
sion levels of all genes as a function of gene amplification identified 
270 genes whose expression was significantly influenced by copy 
number across all 14 cell lines (Fig. 4, Supplemental Fig. B). According 
to the gene ontology data,* 91 of the 270 genes represented 
hypothetical proteins or genes with no functional annotation, whereas 
179 had associated functional information available. Of these, 151 
(84%) are implicated in apoptosis, cell proliferation, signal transduc- 
tion, and transcription, whereas 28 (16%) had functional annotations 
that could not be directly linked with cancer. 



DISCUSSION 

The importance of recurrent gene and chromosome copy number 
changes in the development and progression of solid tumors has been 
characterized in >1000 publications applying CGH* (9, 10), as well 
as in a large number of other molecular cytogenetic, cytogenetic, and 
molecular genetic studies. The effects of these somatic genetic 
changes on gene expression levels have remained largely unknown, 
although a few studies have explored gene expression changes occurring 
in specific amplicons (15, 19-21). Here, we applied genome-wide 
cDNA microarrays to identify transcripts whose expression 
changes were attributable to underlying gene copy number alterations 
in breast cancer. 

The overall impact of copy number on gene expression patterns was 
substantial with the most dramatic effects seen in the case of high- 



Internet address: http://www.geneontology.org/. 

' Internet address: http://www.ncbi.nlm.nih.gov/entrez. 

level copy number increase. Low-level copy number gains and losses 
also had a significant influence on expression levels of genes in the 
regions affected, but these effects were more subtle on a gene-by-gene 
basis than those of high-level amplifications. However, the impact of 
low-level gains on the dysregulation of gene expression patterns in 
cancer may be equally important if not more important than that of 
high-level amplifications. Aneuploidy and low-level gains and losses 
of chromosomal arms represent the most common types of genetic 
alterations in breast and other cancers and, therefore, have an influence 
on many genes. Our results in breast cancer extend the recent 
studies on the impact of aneuploidy on global gene expression patterns 
in yeast cells, acute myeloid leukemia, and a prostate cancer 
model system (22-24). 

The CGH microarray analysis identified 24 independent breast 
cancer amplicons. We defined the precise boundaries for many amplicons 
detected previously by chromosomal CGH (9, 10, 25, 26) and 
also discovered novel amplicons that had not been detected previously, 
presumably because of their small size (only 1-2 Mb) or close 
proximity to other larger amplicons. One of these novel amplicons 
involved the homeobox gene region at 17q21.3 and led to the overexpression 
of the HOXB7 and HOXB2 genes. The homeodomain 
transcription factors are known to be key regulators of embryonic 
development and have been occasionally reported to undergo aberrant 
expression in cancer (27, 28). HOXB7 transfection induced cell proliferation 
in melanoma, breast, and ovarian cancer cells and increased 
tumorigenicity and angiogenesis in breast cancer (29-32). The present 
results imply that gene amplification may be a prominent mechanism 
for overexpressing HOXB7 in breast cancer and suggest that 
HOXB7 contributes to tumor progression and confers an aggressive 
disease phenotype in breast cancer. This view is supported by our 
finding of amplification of HOXB7 in 10% of 363 primary breast 
cancers, as well as an association of amplification with poor prognosis 
of the patients. 

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to 
amplification status. Statistical analysis revealed 270 such genes 
(representing ~2% of all genes on the array), including not only 
previously described amplified genes, such as HER-2, MYC, 
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous 
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22), 
and bone morphogenic protein (20q13.1), whose activation by 
amplification may similarly promote breast cancer progression. 
Most of the 270 genes have not been implicated previously in 
breast cancer development and suggest novel pathogenetic mechanisms. 
Although we would not expect all of them to be causally 
involved, it is intriguing that 84% of the genes with associated 
functional information were implicated in apoptosis, cell proliferation, 
signal transduction, transcription, or other cellular processes 
that could directly imply a possible role in cancer progression. 
Therefore, a detailed characterization of these genes may provide 
biological insights to breast cancer progression and might lead to 
the development of novel therapeutic strategies. 

In summary, we demonstrate application of cDNA microarrays 
to the analysis of both copy number and expression levels of over 
12,000 transcripts throughout the breast cancer genome, roughly 
once every 267 kb. This analysis provided: (a) evidence of a 
prominent global influence of copy number changes on gene 
expression levels; (b) a high-resolution map of 24 independent 
amplicons in breast cancer; and (c) identification of a set of 270 
genes the overexpression of which was statistically attributable to 
gene amplification. Characterization of a novel amplicon at 
17q21.3 implicated amplification and overexpression of the 
HOXB7 gene in breast cancer, including a clinical association 
between HOXB7 amplification and poor patient prognosis. Overall, 
our results illustrate how the identification of genes activated by 
gene amplification provides a powerful approach to highlight 
genes with an important role in cancer as well as to prioritize and 
validate putative targets for therapy development. 

Contributed by Patrick O. Brown, August 6, 2002 
Genomic DNA copy number alterations are key genetic events in 
the development and progression of human cancers. Here we 
report a genome-wide microarray comparative genomic hybrid- 
ization (array CGH) analysis of DNA copy number variation in 
a series of primary human breast tumors. We have profiled DNA 
copy number alteration across 6,691 mapped human genes, in 44 
predominantly advanced, primary breast tumors and 10 breast 
cancer cell lines. While the overall patterns of DNA amplification 
and deletion corroborate previous cytogenetic studies, the high- 
resolution (gene-by-gene) mapping of amplicon boundaries and 
the quantitative analysis of amplicon shape provide significant 
improvement in the localization of candidate oncogenes. Parallel 
microarray measurements of mRNA levels reveal the remarkable 
degree to which variation in gene copy number contributes to 
variation in gene expression in tumor cells. Specifically, we find 
that 62% of highly amplified genes show moderately or highly 
elevated expression, that DNA copy number influences gene expression 
across a wide range of DNA copy number alterations 
(deletion, low-, mid- and high-level amplification), that on average, 
a 2-fold change in DNA copy number is associated with a corresponding 
1.5-fold change in mRNA levels, and that overall, at least 
12% of all the variation in gene expression among the breast 
tumors is directly attributable to underlying variation in gene copy 
number. These findings provide evidence that widespread DNA 
copy number alteration can lead directly to global deregulation of 
gene expression, which may contribute to the development or 
progression of cancer. 

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identification
of a number of recurrent regions of DNA copy number
alteration in breast cancer cell lines and tumors (2-4). While 
some of these regions contain known or candidate oncogenes 
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide 
map, delineating the boundaries of DNA copy number alterations
in tumors, should facilitate the localization and identification
of oncogenes and tumor suppressor genes in breast
cancer. In this study, we have created such a map, using
array-based CGH (5-7) to profile DNA copy number alteration
in a series of breast cancer cell lines and primary tumors.

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified
in breast tumors alter expression of genes within involved
regions. Because we had measured mRNA levels in parallel in
the same samples (8), using the same DNA microarrays, we had
an opportunity to explore on a genomic scale the relationship 
between DNA copy number changes and gene expression. From 

www.pnas.org/cgi/doi/10.1073/pnas.162471999



this analysis, we have identified a significant impact of wide- 
spread DNA copy number alteration on the transcriptional 
programs of breast tumors. 

Materials and Methods

Tumors and Cell Lines. Primary breast tumors were predominantly
large (>3 cm), intermediate-grade, infiltrating ductal carcinomas,
with more than 50% being lymph node positive. The
fraction of tumor cells within specimens averaged at least 50%.
Details of individual tumors have been published (8, 9), and
are summarized in Table 1, which is published as supporting
information on the PNAS web site, www.pnas.org. Breast cancer
cell lines were obtained from the American Type Culture
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA labeling
and hybridizations were performed essentially as described
in Pollack et al. (7), with slight modifications. Two micrograms
of DNA was labeled in a total volume of 50 microliters, and the
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691
different mapped human genes (i.e., UniGene clusters). The
"reference" (labeled with Cy3) for each hybridization was normal
female leukocyte DNA from a single donor. The fabrication
of cDNA microarrays and the labeling and hybridization of
mRNA samples have been described (8).

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and
fluorescence ratios (test/reference) calculated using ScanAlyze
software (available at http://rana.lbl.gov). Fluorescence ratios
were normalized for each array by setting the average log
fluorescence ratio for all array elements equal to 0. Measurements
with fluorescence intensities more than 20% above background
were considered reliable. DNA copy number profiles
that deviated significantly from background ratios measured in
normal genomic DNA control hybridizations were interpreted as
evidence of real DNA copy number alteration (see Estimating
Significance of Altered Fluorescence Ratios in the supporting
information). When indicated, DNA copy number profiles are
displayed as a moving average (symmetric 5-nearest neighbor).
Map positions for arrayed human cDNAs were assigned by 



Abbreviation: CGH, comparative genomic hybridization.
numbers of X chromosomes, for breast cancer cell lines, and for breast tumors. Each row represents a different cell line or tumor, and each
column represents one of 6,691 different mapped human genes present on the microarray, ordered by g[...]
[...] fold-amplification, green luminescence reflects fold-deletion, and black indicates no change (gray indicates poorly measured data). (b) Enlarged view of DNA
copy number profiles across the X chromosome, shown for cell lines containing different numbers of X chromosomes.



identifying the starting position of the best and longest match of 
any DNA sequence represented in the corresponding UniGene 
cluster (10) against the "Golden Path" genome assembly 
(http://genome.ucsc.edu/; Oct. 7, 2000 Freeze). For UniGene
clusters represented by multiple arrayed elements, mean fluo- 
rescence ratios (for all elements representing the same UniGene 
cluster) are reported. For mRNA measurements, fluorescence 
ratios are "mean-centered" (i.e., reported relative to the mean
ratio across the 44 tumor samples). The data set described here 
can be accessed in its entirety in the supporting information. 
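
The normalization and smoothing steps described under Data Analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the ScanAlyze procedure itself: the function names are hypothetical, log base 2 is assumed (the text says only "log"), and "symmetric 5-nearest neighbor" is read here as a window of five map-ordered genes centered on each gene, truncated at the ends of the profile.

```python
import math


def normalize_log_ratios(ratios):
    """Center log2(test/reference) ratios so their mean is 0 for one array.

    Assumption: base-2 logs; the source text does not state the base.
    """
    logs = [math.log2(r) for r in ratios]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def moving_average(values, window=5):
    """Symmetric moving average over map-ordered values.

    Assumption: a centered window of `window` genes that shrinks
    at the two ends of the profile.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical fluorescence ratios for five neighboring genes on one array.
ratios = [1.0, 2.0, 4.0, 2.0, 1.0]
centered = normalize_log_ratios(ratios)
profile = moving_average(centered)
```

Centering in log space treats a 2-fold gain and a 2-fold loss symmetrically, which is why the average is taken over log ratios rather than raw ratios.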

Results 

We performed CGH on 44 predominantly locally advanced, 
primary breast tumors and 10 breast cancer cell lines, using 
cDNA microarrays containing 6,691 different mapped human 
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the im- 
proved spatial resolution of array CGH, we ordered (fluores- 
cence ratios for) the 6,691 cDNAs according to the "Golden 
Path" (http://genome.ucsc.edu/) genome assembly of the draft
human genome sequences (11). In so doing, arrayed cDNAs not
only themselves represent genes of potential interest (e.g.,
candidate oncogenes within amplicons), but also provide precise 
genetic landmarks for chromosomal regions of amplification and 



deletion. Parallel analysis of DNA from cell lines containing 
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect single-copy
loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence 
ratios were linearly proportional to copy number ratios, which 
were slightly underestimated, in agreement with previous ob- 
servations (7). Numerous DNA copy number alterations were 
evident in both the breast cancer cell lines and primary tumors 
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes 
were generally lower in the tumor samples. DNA copy-number 
alterations were found in every cancer cell line and tumor, and 
on every human chromosome in at least one sample. Recurrent 
regions of DNA copy number gain and loss were readily identifiable.
For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing
of gains/losses is provided in Tables 2 and 3, which are published 
as supporting information on the PNAS web site). The total 
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number of genomic alterations (gains and losses) was found to 
be significantly higher in breast tumors that were high grade (P =
0.008), consistent with published CGH data (3), estrogen receptor
negative (P = 0.04), and harboring TP53 mutations (P =
0.0006) (see Table 4, which is published as supporting information
on the PNAS web site).

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the 
variation in the copy number of 241 genes mapping to chromosome
8 revealed multiple regions of recurrent amplification;
each of these potentially harbors a different known or previously
uncharacterized oncogene (Fig. 2a). The complexity of amplicon
structure is most easily appreciated in the breast cancer cell line
SKBR3. Although a conventional CGH analysis of 8q in SKBR3
identified only two distinct regions of amplification (12), we 
observed three distinct regions of high-level amplification (la- 
beled 1-3 in Fig. 2b). For each of these regions we can define the 

Pollack et al.



boundaries of the interval recurrently amplified in the tumors we 
examined; in each case, known or plausible candidate oncogenes
can be identified (a description of these regions, as well as the
recurrently amplified regions on chromosomes 17 and 20, can be
found in Figs. 6 and 7, which are published as supporting
information on the PNAS web site).

For a subset of breast cancer cell lines and tumors (4 and 37,
respectively), and a subset of arrayed genes (6,095), mRNA
levels were quantitatively measured in parallel by using cDNA
microarrays (8). The parallel assessment of mRNA levels is
useful in the interpretation of DNA copy number changes. For
example, the highly amplified genes that are also highly expressed
are the strongest candidate oncogenes within an amplicon.
Perhaps more significantly, our parallel analysis of DNA
copy number changes and mRNA levels provides us the opportunity
to assess the global impact of widespread DNA copy
number alteration on gene expression in tumor cells.

A strong influence of DNA copy number on gene expression 
is evident in an examination of the pseudocolor representations 
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of DNA copy number and mRNA levels for genes on chromo- 
some 17 (Fig. 3). The overall patterns of gene amplification and 
elevated gene expression are quite concordant; i.e., a significant
fraction of highly amplified genes appear to be correspondingly 
highly expressed. The concordance between high-level amplification
and increased gene expression is not restricted to chromosome
17. Genome-wide, of 117 high-level DNA amplifications
(fluorescence ratios >4, and representing 91 different
genes), 62% (representing 54 different genes; see Table 5, which 
is published as supporting information on the PNAS web site) 
are found associated with at least moderately elevated mRNA 
levels (mean-centered fluorescence ratios >2), and 42% (rep- 
resenting 36 different genes) are found associated with compa- 
rably highly elevated mRNA levels (mean-centered fluorescence 
ratios >4). 
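
The tally described above (the fraction of high-level amplifications that are accompanied by elevated mRNA) can be sketched as a simple threshold count. The cutoffs are the ones stated in the text (DNA ratio >4; mean-centered mRNA ratio >2 or >4); the function name and the example measurements are hypothetical and are not data from the study.

```python
def concordance(pairs, dna_cutoff=4.0, mrna_cutoff=2.0):
    """Fraction of highly amplified measurements with elevated mRNA.

    `pairs` is a list of (dna_ratio, mrna_ratio) tuples, one per
    gene/sample measurement. Returns None when nothing exceeds the
    DNA cutoff.
    """
    amplified = [(d, m) for d, m in pairs if d > dna_cutoff]
    if not amplified:
        return None
    elevated = sum(1 for _, m in amplified if m > mrna_cutoff)
    return elevated / len(amplified)


# Hypothetical (DNA ratio, mean-centered mRNA ratio) measurements.
example = [(5.0, 3.0), (6.0, 1.0), (4.5, 5.0), (2.0, 8.0), (8.0, 2.5)]
moderately_elevated = concordance(example, mrna_cutoff=2.0)
highly_elevated = concordance(example, mrna_cutoff=4.0)
```

Note that the denominator is the set of amplified measurements only, so a gene with high mRNA but no amplification (the fourth pair above) does not enter the calculation.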

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the



breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a 
statistically significant fashion (P values for pair-wise Student's
t tests comparing adjacent classes: cell lines, 4 x 10^-[illegible], 1 x 10^-[illegible],
5 x 10^-[illegible], 1 x 10^-[illegible]; tumors, 1 x 10^-[illegible], 1 x 10^-[illegible], 5 x 10^-[illegible],
1 x 10^-[illegible]). A linear regression of the average log(DNA copy
number), for each class, against average log(mRNA level) 
demonstrated that on average, a 2-fold change in DNA copy 
number was accompanied by 1.4- and 1.5-fold changes in mRNA
level for the breast cancer cell lines and tumors, respectively (Fig.
4a, regression line not shown). Second, we characterized the
distribution of the 6,095 correlations between DNA copy number
and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve,
but with the peak markedly shifted in the positive direction from
zero. This shift is statistically significant, as evidenced in a plot
of observed vs. expected correlations (Fig. 4c), and reflects a
pervasive global influence of DNA copy number alterations on
gene expression. Notably, the highest correlations between DNA
copy number and mRNA level (the right tail of the distribution
in Fig. 4b) comprise both amplified and deleted genes (data not
shown). Third, we used a linear regression model to estimate the 
fraction of all variation measured in mRNA levels among the 37 
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tumors that could be attributed to underlying variation in DNA 
copy number. From this analysis, we estimate that, overall, about 
7% of all of the observed variation in mRNA levels can be
explained directly by variation in copy number of the altered 
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction 
of the data most reliably measured (fluorescence intensity/ 
background >3); using that data, our estimate of the percent 
variation in mRNA levels directly attributed to variation in gene 
copy number increases to 12% (Fig. 4d). This still undoubtedly
represents a significant underestimate, as the observed variation 
in global gene expression is affected not only by true variation in 
the expression programs of the tumor cells themselves, but also 
by the variable presence of non-tumor cell types within clinical 
samples. 
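
The relationship between the regression of log(mRNA level) on log(DNA copy number) and the reported fold changes can be made concrete with a short sketch. Under a log-log linear model with slope b, a 2-fold change in copy number corresponds to a 2^b-fold change in mRNA, so a 1.5-fold change corresponds to b = log2(1.5), roughly 0.585. The code below is an illustration only: the helper names are hypothetical, base-2 logs are assumed, and the class averages are synthetic values constructed to lie on such a line, not the study's data.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def fold_change_per_doubling(slope):
    """mRNA fold change implied by a 2-fold DNA copy number change."""
    return 2.0 ** slope


# Synthetic class averages (deletion, no change, low-, medium-, and
# high-level amplification) on a log2 scale; values are assumptions.
log2_dna = [-1.0, 0.0, 1.0, 2.0, 3.0]
log2_mrna = [math.log2(1.5) * x for x in log2_dna]

slope = ols_slope(log2_dna, log2_mrna)
fold = fold_change_per_doubling(slope)
```

A slope below 1 on this scale is what the text describes: expression follows copy number, but less than proportionally.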

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution 
(gene-by-gene), and quantitatively measuring amplicon shape, to 
assist in locating and identifying candidate oncogenes. By ana- 
lyzing mRNA levels in parallel, we have also discovered that 
changes in DNA copy number have a large, pervasive, direct 
effect on global gene expression patterns in both breast cancer 



cell lines and tumors. Although the DNA microarrays used in our 
analysis may display a bias toward characterized and/or highly
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable
(but would nevertheless still be remarkable if only applicable to
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In 
agreement with our findings, Phillips et al. (14) have shown that
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with
normal colonic epithelium. This report differs substantially from
our finding that 62% of highly amplified genes in breast cancer
exhibit at least 2-fold increased expression. These contrasting 
findings may reflect methodological differences between the 
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studies. For example, the study of Platzer et al. (15) may have 
systematically under-measured gene expression changes. In this 
regard it is remarkable that only 14 transcripts of many thousand
residing within unamplified chromosomal regions were found to 
exhibit at least 4-fold altered expression in metastatic colon 
cancer. Additionally, their reliance on lower-resolution chromosomal
CGH may have resulted in poorly delimiting the boundaries
of high-complexity amplicons, effectively overcalling regions
with amplification. Alternatively, the contrasting findings
for amplified genes may represent real biological differences 
between breast and metastatic colon tumors; resolution of this 
issue will require further studies. 

Our finding that widespread DNA copy number alteration has 
a large, pervasive and direct effect on global gene expression 
patterns in breast cancer has several important implications. 
First, this finding supports a high degree of copy number- 
dependent gene expression in tumors. Second, it suggests that 
most genes are not subject to specific autoregulation or dosage 
compensation. Third, this finding cautions that elevated expres- 
sion of an amplified gene cannot alone be considered strong 
independent evidence of a candidate oncogene's role in tumor- 
igenesis. In our study, fully 62% of highly amplified genes 
demonstrated moderately or highly elevated expression. This 
highlights the importance of high-resolution mapping of ampli- 
con boundaries and shape [to identify the "driving" gene(s) 
within amplicons (16)], on a large number of samples, in addition 
to functional studies. Fourth, this finding suggests that analyzing 
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the genomic distribution of expressed genes, even within existing 
microarray gene expression data sets, may permit the inference 
of DNA copy number aberration, particularly aneuploidy (where 
gene expression can be averaged across large chromosomal 
regions; see Fig. 3 and supporting information). Fifth, this 
finding implies that a substantial portion of the phenotypic 
uniqueness (and by extension, the heterogeneity in clinical 
behavior) among patients' tumors may be traceable to underlying
variation in DNA copy number. Sixth, this finding supports
a possible role for widespread DNA copy number alteration in
tumorigenesis (17, 18), beyond the amplification of specific
oncogenes and deletion of specific tumor suppressor genes.
Widespread DNA copy number alteration, and the concomitant
widespread imbalance in gene expression, might disrupt critical
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor 
development or progression. Finally, our findings suggest the 
possibility of cancer therapies that exploit specific or global 
imbalances in gene expression in cancer. 
Each year, over 182,000 women in the United States are
diagnosed with breast cancer, and approximately 45,000 die
of the disease.1 Incidence appears to be increasing in the
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role.2 

Five-year survival rates range from approximately 65%- 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adju- 
vant therapy, which increases their survival. However, 20%- 
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expres- 
sion of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in 
40%-60% of intraductal breast carcinoma.^ 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry 
(IHC, HercepTest™) and FISH (fluorescent in situ hybridiza- 
tion, PathVysion™ Kit). Both methods can be performed on 
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantifi- 
cation of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplification.
At least one study has demonstrated a difference in



recurrence risk in women younger than 40 years of age for 
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.7% for 
patients with no HER-2/neu gene amplification.^ HER-2/neu 
status may be particularly important to establish in women with 
small (≤1 cm) tumor size.

The choice of methodology for determination of HER-2/ 
neu status depends in part on the clinical setting. FDA approval 
for the Vysis FISH test was granted based on clinical trials 
involving 1549 node-positive patients. Patients received one 
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of adriamycin- 
based therapy, while those with normal HER-2/neu levels did 
not. The study therefore identified a sub-set of women, who 
because they did not benefit from more aggressive treatment, 
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death.^ Demonstration of HER- 
2/neu gene amplification by FISH has also been shown to be 
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (Trastuzumab) monoclonal
antibody therapy, however, is based upon demonstration
of HER-2/neu protein overexpression using HercepTest™.
Studies using Herceptin® in patients with metastatic breast
cancer show an increase in time to disease progression, 
increased response rate to chemotherapeutic agents and a small 
increase in overall survival rate. The FISH assays have not yet 
been approved for this purpose, and studies looking at response 
to Herceptin® in patients with or without gene amplification 
status determined by FISH are in progress. 

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clinical
significance of such results is unclear. Based on the above
considerations, HER-2/neu testing at SHMC/PAML will utilize
immunohistochemistry (HercepTest®) as a screen, followed
by FISH in IHC-negative cases. Alternatively, either
method may be ordered individually depending on the clini- 
cal setting or clinician preference. 
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CPT code information 

HER-2/neu via IHC 

88342 (including interpretive report) 

HER-2/neu via FISH 

88271 x2 Molecular cytogenetics, DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization,
analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation
and report



Procedural Information 

Immunohistochemistry is performed using the FDA-approved
DAKO antibody kit, HercepTest®. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary
rabbit antibody to human HER-2/neu protein, the kit employs
a ready-to-use dextran-based visualization reagent. This reagent
consists of both secondary goat anti-rabbit antibody
molecules with horseradish peroxidase molecules linked to a
common dextran polymer backbone, thus eliminating the need
for sequential application of link antibody and peroxidase
conjugated antibody. Enzymatic conversion of the subsequently
added chromogen results in formation of visible
reaction product at the antigen site. The specimen is then counterstained;
a pathologist using light microscopy interprets
results.

FISH analysis at SHMC/PAML is performed using the
FDA-approved PathVysion™ HER-2/neu DNA probe kit, produced
by Vysis, Inc. Formalin-fixed, paraffin-embedded breast
tissue is processed using routine histological methods, and then
slides are treated to allow hybridization of DNA probes to the
nuclei present in the tissue section. The PathVysion™ kit contains
two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, spectrum orange) present at
the chromosome 17 centromere and the second for the HER-2/neu
oncogene located at 17q11.2-12 (spectrum green). Enumeration
of the probes allows a ratio of the number of copies
of chromosome 17 to the number of copies of HER-2/neu to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or an increase in the number of
chromosome 17s in the cells. In the majority of cases, ratio
equivalents less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of these data will be
performed and reported from the Vysis-certified Cytogenetics
laboratory at SHMC.
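
The ratio-based reading of a FISH result described above can be sketched as follows. This is an illustration only, not the laboratory's or the kit manufacturer's scoring procedure: the function names are hypothetical, the ratio is taken as HER-2/neu signals divided by CEP 17 signals (the orientation under which values above 2 indicate amplification), and the text gives no rule for ratios from 2.0 up to 2.1, so that band is returned as "unspecified".

```python
def her2_cep17_ratio(her2_signals, cep17_signals):
    """Ratio of total HER-2/neu signals to total CEP 17 signals
    counted over the scored nuclei."""
    if cep17_signals <= 0:
        raise ValueError("CEP 17 signal count must be positive")
    return her2_signals / cep17_signals


def interpret_ratio(ratio):
    """Apply the cutoffs quoted in the text to a HER-2/CEP 17 ratio."""
    if ratio < 2.0:
        return "normal/negative"
    if ratio >= 2.1:
        return "amplified"
    # Assumption: the text states no rule for 2.0 <= ratio < 2.1.
    return "unspecified"


# Hypothetical counts over 25 nuclei.
# Four HER-2 and four CEP 17 signals per nucleus: extra copies of
# chromosome 17 without gene amplification, so the ratio stays at 1.
polysomy = interpret_ratio(her2_cep17_ratio(100, 100))
# Ten HER-2 and two CEP 17 signals per nucleus: ratio of 5.
amplified = interpret_ratio(her2_cep17_ratio(250, 50))
```

Dividing by the centromere count is what separates true gene amplification from a gain of whole copies of chromosome 17, which is the distinction the paragraph calls clinically relevant.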
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ABSTRACT Wnt family members are critical to many 
developmental processes, and components of the Wnt signal- 
ing pathway have been linked to tumorigenesis in familial and 
sporadic colon carcinomas. Here we report the identification
of two genes, WISP-1 and WISP-2, that are up-regulated in the 
mouse mammary epithelial cell line C57MG transformed by 
Wnt-1, but not by Wnt-4. Together with a third related gene, 
WISP-3, these proteins define a subfamily of the connective
tissue growth factor family. Two distinct systems demon- 
strated WISP induction to be associated with the expression of 
Wnt-1. These included (i) C57MG cells infected with a Wnt-1
retroviral vector or expressing Wnt-1 under the control of a
tetracycline-repressible promoter, and (ii) Wnt-1 transgenic
mice. The WISP-1 gene was localized to human chromosome
8q24.1-8q24.3. WISP-1 genomic DNA was amplified in colon
cancer cell lines and in human colon tumors and its RNA 
overexpressed (2- to > 30-fold) in 84% of the tumors examined 
compared with patient-matched normal mucosa. WISP-3
mapped to chromosome 6q22-6q23 and also was overex- 
pressed (4- to > 40-fold) in ^% of the colon tumors analyzed. 
In contrast, WISP-2 mapped to human chromosome
20q12-20q13 and its DNA was amplified, but RNA expression was
reduced (2- to >30-fold) in 79% of the tumors. These results 
suggest that the WISP genes may be downstream of Wnt-1 
signaling and that aberrant levels of WISP expression in colon 
cancer may play a role in colon tumorigenesis. 



Wnt-1 is a member of an expanding family of cysteine-rich, 
glycosylated signaling proteins that mediate diverse develop- 
mental processes such as the control of cell proliferation, 
adhesion, cell polarity, and the establishment of cell fates (1, 
2). Wnt-1 originally was identified as an oncogene activated by 
the insertion of mouse mammary tumor virus in virus-induced 
mammary adenocarcinomas (3, 4). Although Wnt-1 is not 
expressed in the normal mammary gland, expression of Wnt-1 
in transgenic mice causes mammary tumors (5). 

In mammalian cells, Wnt family members initiate signaling 
by binding to the seven-transmembrane spanning Frizzled 
receptors and recruiting the cytoplasmic protein Dishevelled 
(Dsh) to the cell membrane (1, 2, 6). Dsh then inhibits the 
kinase activity of the normally constitutively active glycogen 
synthase kinase-3β (GSK-3β), resulting in an increase in
β-catenin levels. Stabilized β-catenin interacts with the transcription
factor TCF/Lef1, forming a complex that appears in
the nucleus and binds TCF/Lef1 target DNA elements to
activate transcription (7, 8). Other experiments suggest that 
the adenomatous polyposis coli (APC) tumor suppressor gene 
also plays an important role in Wnt signaling by regulating 
β-catenin levels (9). APC is phosphorylated by GSK-3β, binds
to β-catenin, and facilitates its degradation. Mutations in
either APC or β-catenin have been associated with colon
carcinomas and melanomas, suggesting these mutations con- 
tribute to the development of these types of cancer, implicating 
the Wnt pathway in tumorigenesis (1). 

Although much has been learned about the Wnt signaling 
pathway over the past several years, only a few of the transcriptionally
activated downstream components activated by
Wnt have been characterized. Those that have been described 
cannot account for all of the diverse functions attributed to 
Wnt signaling. Among the candidate Wnt target genes are 
those encoding the nodal-related 3 gene, Xnr3, a member of
the transforming growth factor (TGF)-β superfamily, and the
homeobox genes, engrailed, goosecoid, twin (Xtwn), and siamois 
(2). A recent report also identifies c-myc as a target gene of the 
Wnt signaling pathway (10). 

To identify additional downstream genes in the Wnt signal- 
ing pathway that are relevant to the transformed cell pheno- 
type, we used a PCR-based cDNA subtraction strategy, sup- 
pression subtractive hybridization (SSH) (11), using RNA 
isolated from C57MG mouse mammary epithelial cells and 
C57MG cells stably transformed by a Wnt-1 retrovirus. Over- 
expression of Wnt-1 in this cell line is sufficient to induce a 
partially transformed phenotype, characterized by elongated 
and refractile cells that lose contact inhibition and form a 
multilayered array (12, 13). We reasoned that genes differen- 
tially expressed between these two cell lines might contribute 
to the transformed phenotype. 

In this paper, we describe the cloning and characterization 
of two genes up-regulated in Wnt-1 transformed cells, WISP-1
and WISP-2, and a third related gene, WISP-3. The WISP genes
are members of the CCN family of growth factors, which 
includes connective tissue growth factor (CTGF), Cyr61, and 
nov, a family not previously linked to Wnt signaling. 

MATERIALS AND METHODS 

SSH. SSH was performed by using the PCR-Select cDNA
Subtraction Kit (CLONTECH). Tester double-stranded cDNA was
synthesized from 2 μg of poly(A)+ RNA isolated from the
C57MG/Wnt-1 cell line and driver cDNA from 2 μg of poly(A)+
RNA from the parent C57MG cells. The subtracted cDNA library
was subcloned into a pGEM-T vector for further analysis.

Abbreviations: TGF, transforming growth factor; CTGF, connective
tissue growth factor; SSH, suppression subtractive hybridization;
VWC, von Willebrand factor type C module.
Data deposition: The sequences reported in this paper have been
deposited in the GenBank database (accession nos. AF100777,
AF100778, AF100779, AF100780, and AF100781).
†To whom reprint requests should be addressed. e-mail: diane@gene.com.

cDNA Library Screening. Clones encoding full-length
mouse WISP-1 were isolated by screening a λgt10 mouse
embryo cDNA library (CLONTECH) with a 70-bp probe from
the original partial clone 568 sequence corresponding to amino
acids 128-169. Clones encoding full-length human WISP-1
were isolated by screening λgt10 lung and fetal kidney cDNA
libraries with the same probe at low stringency. Clones encoding
full-length mouse and human WISP-2 were isolated by
screening a C57MG/Wnt-1 or human fetal lung cDNA library
with a probe corresponding to nucleotides 1463-1512. Full-length
cDNAs encoding WISP-3 were cloned from human
bone marrow and fetal kidney libraries.

Expression of Human WISP RNA. PCR amplification of
first-strand cDNA was performed with human Multiple Tissue
cDNA panels (CLONTECH) and 300 μM of each dNTP at
94°C for 1 sec, 62°C for 30 sec, 72°C for 1 min, for 22-32 cycles.
WISP and glyceraldehyde-3-phosphate dehydrogenase primer
sequences are available on request.

In Situ Hybridization. 33P-labeled sense and antisense riboprobes
were transcribed from an 897-bp PCR product corresponding
to nucleotides 601-1440 of mouse WISP-1 or a
294-bp PCR product corresponding to nucleotides 82-375 of
mouse WISP-2. All tissues were processed as described (40).

Radiation Hybrid Mapping. Genomic DNA from each 
hybrid in the Stanford G3 and Genebridge4 Radiation Hybrid 
Panels (Research Genetics, Huntsville, AL) and human and 
hamster control DNAs were PCR-amplified, and the results 
were submitted to the Stanford or Massachusetts Institute of 
Technology web servers. 

Cell Lines, Tumors, and Mucosa Specimens. Tissue speci- 
mens were obtained from the Department of Pathology (Uni- 
versity of Pittsburgh) for patients undergoing colon resection 
and from the University of Leeds, United Kingdom. Genomic 
DNA was isolated (Qiagen) from the pooled blood of 10 
normal human donors, surgical specimens, and the following 
ATCC human cell lines: SW480, COLO 320DM, HT-29,
WiDr, and SW403 (colon adenocarcinomas), SW620 (lymph 
node metastasis, colon adenocarcinoma), HCT 116 (colon 
carcinoma), SK-CO-1 (colon adenocarcinoma, ascites), and 
HM7 (a variant of ATCC colon adenocarcinoma cell line LS 
174T). DNA concentration was determined by using Hoechst 
dye 33258 intercalation fluorimetry. Total RNA was prepared
by homogenization in 7 M GuSCN followed by centrifugation 
over CsCl cushions or prepared by using RNAzol. 

Gene Amplification and RNA Expression Analysis. Relative
gene amplification and RNA expression of WISPs and c-myc in
the cell lines, colorectal tumors, and normal mucosa were
determined by quantitative PCR. Gene-specific primers and
fluorogenic probes (sequences available on request) were
designed and used to amplify and quantitate the genes. The
relative gene copy number was derived by using the formula
2^(ΔCt), where ΔCt represents the difference in amplification
cycles required to detect the WISP genes in peripheral blood
lymphocyte DNA compared with colon tumor DNA or colon
tumor RNA compared with normal mucosal RNA. The
δ-method was used for calculation of the SE of the gene copy
number or RNA expression level. The WISP-specific signal was
normalized to that of the glyceraldehyde-3-phosphate
dehydrogenase housekeeping gene. All TaqMan assay reagents
were obtained from Perkin-Elmer Applied Biosystems.
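
The 2^(ΔCt) calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and the Ct values used below are hypothetical, and the way the glyceraldehyde-3-phosphate dehydrogenase normalization is combined with ΔCt (subtracting the housekeeping Ct from the target Ct before taking the difference) is an assumption, because the text does not spell it out.

```python
def relative_level(ct_reference: float, ct_sample: float) -> float:
    """Return 2**(dCt), with dCt = Ct(reference) - Ct(sample).

    A sample that crosses threshold earlier than the reference
    (lower Ct) gives a value greater than 1.
    """
    return 2.0 ** (ct_reference - ct_sample)


def normalized_relative_level(
    ct_target_reference: float,
    ct_target_sample: float,
    ct_housekeeping_reference: float,
    ct_housekeeping_sample: float,
) -> float:
    """Same calculation after normalizing each target Ct to a housekeeping gene.

    Assumption: normalization is done by subtracting the housekeeping
    Ct from the target Ct within each specimen.
    """
    delta_reference = ct_target_reference - ct_housekeeping_reference
    delta_sample = ct_target_sample - ct_housekeeping_sample
    return 2.0 ** (delta_reference - delta_sample)


if __name__ == "__main__":
    # Hypothetical Ct values: the tumor specimen is detected two cycles
    # earlier than the pooled normal reference.
    print(relative_level(ct_reference=27.0, ct_sample=25.0))
```

Each cycle of difference corresponds to a factor of two, so a target detected two cycles earlier in the sample than in the reference is reported as a 4-fold relative level under this formula.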

RESULTS 

Isolation of WISP-1 and WISP-2 by SSH. To identify
Wnt-1-inducible genes, we used the technique of SSH using the
mouse mammary epithelial cell line C57MG and C57MG cells
that stably express Wnt-1 (11). Candidate differentially
expressed cDNAs (1,384 total) were sequenced. Thirty-nine
percent of the sequences matched known genes or homo- 
logues, 32% matched expressed sequence tags, and 29% had 
no match. To confirm that the transcript was differentially 
expressed, semiquantitative reverse transcription-PCR and 
Northern analysis were performed by using mRNA from the 
C57MG and C57MG/Wnt-1 cells. 

Two of the cDNAs, WISP-1 and WISP-2, were differentially
expressed, being induced in the C57MG/Wnt-1 cell line, but
not in the parent C57MG cells or C57MG cells overexpressing
Wnt-4 (Fig. 1 A and B). Wnt-4, unlike Wnt-1, does not induce
the morphological transformation of C57MG cells and has no
effect on β-catenin levels (13, 14). Expression of WISP-1 was
up-regulated approximately 3-fold in the C57MG/Wnt-1 cell
line and WISP-2 by approximately 5-fold by both Northern
analysis and reverse transcription-PCR.

An independent, but similar, system was used to examine 
WISP expression after Wnt-1 induction. C57MG cells express- 
ing the Wnt-1 gene under the control of a tetracycline- 
repressible promoter produce low amounts of Wnt-1 in the 
repressed state but show a strong induction of Wnt-1 mRNA 
and protein within 24 hr after tetracycline removal (8). The 
levels of Wnt-1 and WISP RNA isolated from these cells at 
various times after tetracycline removal were assessed by 
quantitative PCR. Strong induction of Wnt-1 mRNA was seen 
as early as 10 hr after tetracycline removal. Induction of WISP 
mRNA (2- to 6-fold) was seen at 48 and 72 hr (data not shown). 
These data support our previous observations that show that 
WISP induction is correlated with Wnt-1 expression. Because 
the induction is slow, occurring after approximately 48 hr, the 
induction of WISPs may be an indirect response to Wnt-1 
signaling. 

cDNA clones of human WISP-1 were isolated and the
sequence compared with mouse WISP-1. The cDNA sequences
of mouse and human WISP-1 were 1,766 and 2,830 bp in length,
respectively, and encode proteins of 367 aa, with predicted
relative molecular masses of ≈40,000 (Mr 40 K). Both have
hydrophobic N-terminal signal sequences, 38 conserved cysteine
residues, and four potential N-linked glycosylation sites
and are 84% identical (Fig. 2A).

Full-length cDNA clones of mouse and human WISP-2 were
1,734 and 1,293 bp in length, respectively, and encode proteins
of 251 and 250 aa, respectively, with predicted relative molecular
masses of ≈27,000 (Mr 27 K) (Fig. 2B). Mouse and human
WISP-2 are 73% identical. Human WISP-2 has no potential
N-linked glycosylation sites, and mouse WISP-2 has one at
position 197. WISP-2 has 28 cysteine residues that are conserved
among the 38 cysteines found in WISP-1.

Fig. 1. WISP-1 and WISP-2 are induced by Wnt-1, but not Wnt-4,
expression in C57MG cells. Northern analysis of WISP-1 (A) and
WISP-2 (B) expression in C57MG, C57MG/Wnt-1, and C57MG/Wnt-4
cells. Poly(A)+ RNA (2 μg) was subjected to Northern blot
analysis and hybridized with a 70-bp mouse WISP-1-specific probe
(amino acids 278-300) or a 190-bp WISP-2-specific probe (nucleotides
1438-1627) in the 3' untranslated region. Blots were rehybridized with
human β-actin probe.

Fig. 2. Encoded amino acid sequence alignment of mouse and
human WISP-1 (A) and mouse and human WISP-2 (B). The potential
signal sequence, insulin-like growth factor-binding protein (IGF-BP),
VWC, thrombospondin (TSP), and C-terminal (CT) domains are
underlined.

Identification of WISP-3. To search for related proteins, we
screened expressed sequence tag (EST) databases with the
WISP-1 protein sequence and identified several ESTs as
potentially related sequences. We identified a homologous
protein that we have called WISP-3. A full-length human
WISP-3 cDNA of 1,371 bp was isolated corresponding to those
ESTs that encode a 354-aa protein with a predicted molecular
mass of 39,293. WISP-3 has two potential N-linked glycosylation
sites and 36 cysteine residues. An alignment of the three
human WISP proteins shows that WISP-1 and WISP-3 are the
most similar (42% identity), whereas WISP-2 has 37% identity
with WISP-1 and 32% identity with WISP-3 (Fig. 3A).

WISPs Are Homologous to the CTGF Family of Proteins.
Human WISP-1, WISP-2, and WISP-3 are novel sequences;
however, mouse WISP-1 is the same as the recently identified
Elm1 gene. Elm1 is expressed in low, but not high, metastatic
mouse melanoma cells, and suppresses the in vivo growth and
metastatic potential of K-1735 mouse melanoma cells (15).
Human and mouse WISP-2 are homologous to the recently
described rat gene, rCop-1 (16). Significant homology (36-44%)
was seen to the CCN family of growth factors. This family
includes three members, CTGF, Cyr61, and the protooncogene
nov. CTGF is a chemotactic and mitogenic factor for
fibroblasts that is implicated in wound healing and fibrotic
disorders and is induced by TGF-β (17). Cyr61 is an extracellular
matrix signaling molecule that promotes cell adhesion,
proliferation, migration, angiogenesis, and tumor growth (18,
19). nov (nephroblastoma overexpressed) is an immediate
early gene associated with quiescence and found altered in
Wilms tumors (20). The proteins of the CCN family share
functional, but not sequence, similarity to Wnt-1. All are
secreted, cysteine-rich heparin binding glycoproteins that associate
with the cell surface and extracellular matrix.

WISP proteins exhibit the modular architecture of the CCN
family, characterized by four conserved cysteine-rich domains
(Fig. 3B) (21). The N-terminal domain, which includes the first
12 cysteine residues, contains a consensus sequence (GCGCCXXC)
conserved in most insulin-like growth factor (IGF)-binding
proteins (BP). This sequence is conserved in WISP-2
and WISP-3, whereas WISP-1 has a glutamine in the third
position instead of a glycine. CTGF recently has been shown
to specifically bind IGF (22) and a truncated nov protein
lacking the IGF-BP domain is oncogenic (23). The von Willebrand
factor type C module (VWC), also found in certain
collagens and mucins, covers the next 10 cysteine residues, and
is thought to participate in protein complex formation and
oligomerization (24). The VWC domain of WISP-3 differs
from all CCN family members described previously, in that it
contains only six of the 10 cysteine residues (Fig. 3 A and B).
A short variable region follows the VWC domain. The third
module, the thrombospondin (TSP) domain, is involved in
binding to sulfated glycoconjugates and contains six cysteine
residues and a conserved WSxCSxxCG motif first identified in
thrombospondin (25). The C-terminal (CT) module containing
the remaining 10 cysteines is thought to be involved in
dimerization and receptor binding (26). The CT domain is
present in all CCN family members described to date but is
absent in WISP-2 (Fig. 3 A and B). The existence of a putative
signal sequence and the absence of a transmembrane domain
suggest that WISPs are secreted proteins, an observation
supported by an analysis of their expression and secretion from
mammalian cell and baculovirus cultures (data not shown).

Fig. 3. (A) Encoded amino acid sequence alignment of human
WISPs. The cysteine residues of WISP-1 and WISP-2 that are not
present in WISP-3 are indicated with a dot. (B) Schematic representation
of the WISP proteins showing the domain structure and cysteine
residues (vertical lines). The four cysteine residues in the VWC domain
that are absent in WISP-3 are indicated with a dot. (C) Expression of
WISP mRNA in human tissues. PCR was performed on human
multiple-tissue cDNA panels (CLONTECH) from the indicated adult
and fetal tissues.

Expression of WISP mRNA in Human Tissues. Tissue-specific
expression of human WISPs was characterized by PCR
analysis on adult and fetal multiple tissue cDNA panels.
WISP-1 expression was seen in the adult heart, kidney, lung,
pancreas, placenta, ovary, small intestine, and spleen (Fig. 3C).
Little or no expression was detected in the brain, liver, skeletal
muscle, colon, peripheral blood leukocytes, prostate, testis, or
thymus. WISP-2 had a more restricted tissue expression and
was detected in adult skeletal muscle, colon, ovary, and fetal
lung. Predominant expression of WISP-3 was seen in adult
kidney and testis and fetal kidney. Lower levels of WISP-3
expression were detected in placenta, ovary, prostate, and
small intestine.

In Situ Localization of WISP-1 and WISP-2. Expression of
WISP-1 and WISP-2 was assessed by in situ hybridization in
mammary tumors from Wnt-1 transgenic mice. Strong expression
of WISP-1 was observed in stromal fibroblasts lying within
the fibrovascular tumor stroma (Fig. 4 A-D). However, low-level
WISP-1 expression also was observed focally within tumor
cells (data not shown). No expression was observed in normal
breast. Like WISP-1, WISP-2 expression also was seen in the
tumor stroma in breast tumors from Wnt-1 transgenic animals
(Fig. 4 E-H). However, WISP-2 expression in the stroma was
in spindle-shaped cells adjacent to capillary vessels, whereas
the predominant cell type expressing WISP-1 was the stromal
fibroblasts.

Fig. 4. (A, C, E, and G) Representative hematoxylin/eosin-stained
images from breast tumors in Wnt-1 transgenic mice. The corresponding
dark-field images showing WISP-1 expression are shown in B and
D. The tumor is a moderately well-differentiated adenocarcinoma
showing evidence of adenoid cystic change. At low power (A and B),
expression of WISP-1 is seen in the delicate branching fibrovascular
tumor stroma (arrowhead). At higher magnification, expression is seen
in the stromal (s) fibroblasts (C and D), and tumor cells are negative.
Focal expression of WISP-1, however, was observed in tumor cells in
some areas. Images of WISP-2 expression are shown in E-H. At low
power (E and F), expression of WISP-2 is seen in cells lying within the
fibrovascular tumor stroma. At higher magnification, these cells
appeared to be adjacent to capillary vessels whereas tumor cells are
negative (G and H).

Chromosome Localization of the WISP Genes. The chro- 
mosomal location of the human WISP genes was determined 
by radiation hybrid mapping panels. WISP-1 is approximately 
3.48 cR from the meiotic marker AFM259xc5 [logarithm of 
odds (lod) score 16.31] on chromosome 8q24.1 to 8q24.3, in the 
same region as the human locus of the novH family member 
(27) and roughly 4 Mbs distal to c-myc (28). Preliminary fine 
mapping indicates that WISP-1 is located near D8S1712 STS. 
WISP-2 is linked to the marker SHGC-33922 (lod = 1,000) on
chromosome 20q12-20q13.1. Human WISP-3 mapped to chromosome
6q22-6q23 and is linked to the marker AFM211ze5
(lod = 1,000). WISP-3 is approximately 18 Mbs proximal to
CTGF and 23 Mbs proximal to the human cellular oncogene
MYB (27, 29).

Amplification and Aberrant Expression of WISPs in Human 
Colon Tumors. Amplification of protooncogenes is seen in 
many human tumors and has etiological and prognostic sig- 
nificance. For example, in a variety of tumor types, c-myc 
amplification has been associated with malignant progression 
and poor prognosis (30). Because WISP-1 resides in the same 
general chromosomal location (8q24) as c-myc, we asked 
whether it was a target of gene amplification, and, if so, 
whether this amplification was independent of the c-myc locus. 
Genomic DNA from human colon cancer cell lines was
assessed by quantitative PCR and Southern blot analysis (Fig.
5 A and B). Both methods detected similar degrees of WISP-1
amplification. Most cell lines showed significant (2- to 4-fold) 
amplification, with the HT-29 and WiDr cell lines demonstrat- 
ing an 8-fold increase. Significantly, the pattern of amplifica- 
tion observed did not correlate with that observed for c-myc, 
indicating that the c-myc gene is not part of the amplicon that 
involves the WISP-1 locus. 

We next examined whether the WISP genes were amplified
in a panel of 25 primary human colon adenocarcinomas. The
relative WISP gene copy number in each colon tumor DNA
was compared with pooled normal DNA from 10 donors by
quantitative PCR (Fig. 6). The copy number of WISP-1 and
WISP-2 was significantly greater than one, approximately
2-fold for WISP-1 in about 60% of the tumors and 2- to 4-fold
for WISP-2 in 92% of the tumors (P < 0.001 for each). The
copy number for WISP-3 was indistinguishable from one (P =
0.166). In addition, the copy number of WISP-2 was significantly
higher than that of WISP-1 (P < 0.001).

The levels of WISP transcripts in RNA isolated from 19
adenocarcinomas and their matched normal mucosa were
assessed by quantitative PCR (Fig. 7). The level of WISP-1
RNA present in tumor tissue varied but was significantly
increased (2- to >25-fold) in 84% (16/19) of the human colon
tumors examined compared with normal adjacent mucosa.
Four of 19 tumors showed greater than 10-fold overexpression.
In contrast, in 79% (15/19) of the tumors examined, WISP-2
RNA expression was significantly lower in the tumor than the
mucosa. Similar to WISP-1, WISP-3 RNA was overexpressed in
63% (12/19) of the colon tumors compared with the normal
mucosa. The amount of overexpression of WISP-3 ranged from
4- to >40-fold.

Fig. 5. Amplification of WISP-1 genomic DNA in colon cancer cell
lines. (A) Amplification in cell line DNA was determined by quantitative
PCR. (B) Southern blots containing genomic DNA (10 μg)
digested with EcoRI (WISP-1) or XbaI (c-myc) were hybridized with
a 100-bp human WISP-1 probe (amino acids 186-219) or a human
c-myc probe (located at bp 1901-2000). The WISP and myc genes are
detected in normal human genomic DNA after a longer film exposure.

Fig. 6. Genomic amplification of WISP genes in human colon
tumors. The relative gene copy number of the WISP genes in 25
adenocarcinomas was assayed by quantitative PCR, by comparing
DNA from primary human tumors with pooled DNA from 10 healthy
donors. The data are means ± SEM from one experiment done in
triplicate. The experiment was repeated at least three times.

Fig. 7. WISP RNA expression in primary human colon tumors
relative to expression in normal mucosa from the same patient.
Expression of WISP mRNA in 19 adenocarcinomas was assayed by
quantitative PCR. The Dukes stage of the tumor is listed under the
sample number. The data are means ± SEM from one experiment
done in triplicate. The experiment was repeated at least twice.



DISCUSSION 

One approach to understanding the molecular basis of cancer 
is to identify differences in gene expression between cancer 
cells and normal cells. Strategies based on assumptions that 
steady-state mRNA levels will differ between normal and
malignant cells have been used to clone differentially ex- 
pressed genes (31). We have used a PCR-based selection 
strategy, SSH, to identify genes selectively expressed in 
C57MG mouse mammary epithelial cells transformed by 
Wnt-1. 

Three of the genes isolated, WISP-1, WISP-2, and WISP-3,
are members of the CCN family of growth factors, which 
includes CTGF, Cyr61, and nov, a family not previously linked 
to Wnt signaling. 

Two independent experimental systems demonstrated that 
WISP induction was associated with the expression of Wnt-1. 
The first was C57MG cells infected with a Wnt-1 retroviral 
vector or C57MG cells expressing Wnt-1 under the control of 
a tetracycline-repressible promoter, and the second was in
Wnt-1 transgenic mice, where breast tissue expresses Wnt-1, 
whereas normal breast tissue does not. No WISP RNA expres- 
sion was detected in mammary tumors induced by polyoma 
virus middle T antigen (data not shown). These data suggest 
a link between Wnt-1 and WISPs in that in these two situations,
WISP induction was correlated with Wnt-1 expression. 

It is not clear whether the WISPs are directly or indirectly
induced by the downstream components of the Wnt-1 signaling
pathway (i.e., β-catenin-TCF-1/Lef1). The increased levels of
WISP RNA were measured in Wnt-1-transformed cells, hours
or days after Wnt-1 transformation. Thus, WISP expression
could result from Wnt-1 signaling directly through β-catenin
transcription factor regulation or alternatively through Wnt-1
signaling turning on a transcription factor, which in turn
regulates WISPs.

The WISPs define an additional subfamily of the CCN family
of growth factors. One striking difference observed in the
protein sequence of WISP-2 is the absence of a CT domain,
which is present in CTGF, Cyr61, nov, WISP-1, and WISP-3.
This domain is thought to be involved in receptor binding and
dimerization. Growth factors, such as TGF-β, platelet-derived
growth factor, and nerve growth factor, which contain a cystine
knot motif exist as dimers (32). It is tempting to speculate that
WISP-1 and WISP-3 may exist as dimers, whereas WISP-2
exists as a monomer. If the CT domain is also important for
receptor binding, WISP-2 may bind its receptor through a
different region of the molecule than the other CCN family
members. No specific receptors have been identified for CTGF
or nov. A recent report has shown that integrin αvβ3 serves as
an adhesion receptor for Cyr61 (33).

The strong expression of WISP-1 and WISP-2 in cells lying 
within the fibrovascular tumor stroma in breast tumors from 
Wnt-1 transgenic animals is consistent with previous obser- 
vations that transcripts for the related CTGF gene are pri- 
marily expressed in the fibrous stroma of mammary tumors 
(34). Epithelial cells are thought to control the proliferation of 
connective tissue stroma in mammary tumors by a cascade of 
growth factor signals similar to that controlling connective 
tissue formation during wound repair. It has been proposed 
that mammary tumor cells or inflammatory cells at the tumor 
interstitial interface secrete TGF-β1, which is the stimulus for
stromal proliferation (34). TGF-β1 is secreted by a large
percentage of malignant breast tumors and may be one of the 
growth factors that stimulates the production of CTGF and 
WISPs in the stroma. 

14722 Cell Biology, Medical Sciences: Pennica et al. Proc. Natl. Acad. Sci. USA 95 (1998)

It was of interest that WISP-1 and WISP-2 expression was
observed in the stromal cells that surrounded the tumor cells
(epithelial cells) in the Wnt-1 transgenic mouse sections of
breast tissue. This finding suggests that paracrine signaling
could occur in which the stromal cells could supply WISP-1 and
WISP-2 to regulate tumor cell growth on the WISP extracellular
matrix. Stromal cell-derived factors in the extracellular
matrix have been postulated to play a role in tumor cell
migration and proliferation (35). The localization of WISP-1
and WISP-2 in the stromal cells of breast tumors supports this
paracrine model.

An analysis of WISP-1 gene amplification and expression in
human colon tumors showed a correlation between DNA
amplification and overexpression, whereas overexpression of
WISP-3 RNA was seen in the absence of DNA amplification.
In contrast, WISP-2 DNA was amplified in the colon tumors,
but its mRNA expression was significantly reduced in the
majority of tumors compared with the expression in normal
colonic mucosa from the same patient. The gene for human
WISP-2 was localized to chromosome 20q12-20q13, at a region
frequently amplified and associated with poor prognosis in
node negative breast cancer and many colon cancers, suggesting
the existence of one or more oncogenes at this locus
(36-38). Because the center of the 20q13 amplicon has not yet
been identified, it is possible that the apparent amplification
observed for WISP-2 may be caused by another gene in this
amplicon.

A recent manuscript on rCop-1, the rat orthologue of 
WISP-2, describes the loss of expression of this gene after cell 
transformation, suggesting it may be a negative regulator of 
growth in cell lines (16). Although the mechanism by which 
WISP-2 RNA expression is down-regulated during malignant
transformation is unknown, the reduced expression of WISP-2
in colon tumors and cell lines suggests that it may function as 
a tumor suppressor. These results show that the WISP genes 
are aberrantly expressed in colon cancer and suggest that their 
altered expression may confer selective growth advantage to 
the tumor. 

Members of the Wnt signaling pathway have been impli- 
cated in the pathogenesis of colon cancer, breast cancer, and 
melanoma, including the tumor suppressor gene adenomatous 
polyposis coli and β-catenin (39). Mutations in specific regions
of either gene can cause the stabilization and accumulation of
cytoplasmic β-catenin, which presumably contributes to human
carcinogenesis through the activation of target genes such
as the WISPs. Although the mechanism by which Wnt-1
transforms cells and induces tumorigenesis is unknown, the
identification of WISPs as genes that may be regulated downstream
of Wnt-1 in C57MG cells suggests they could be
important mediators of Wnt-1 transformation. The amplification
and altered expression patterns of the WISPs in human
colon tumors may indicate an important role for these genes
in tumor development.

ABSTRACT The consistent cytogenetic translocation of
chronic myelogenous leukemia (the Philadelphia chromosome,
Ph1) has been observed in cells of multiple hematopoietic
lineages. This translocation creates a chimeric gene composed
of breakpoint-cluster-region (bcr) sequences from chromosome
22 fused to a portion of the abl oncogene on chromosome 9. The
resulting gene product (P210c-abl) resembles the transforming
protein of the Abelson murine leukemia virus in its structure
and tyrosine kinase activity. P210c-abl is expressed in Ph1-positive
cell lines of myeloid lineage and in clinical specimens
with myeloid predominance. We show here that Epstein-Barr
virus-transformed B-lymphocyte lines that retain Ph1 can
express P210c-abl. The level of expression in these B-cell lines is
generally lower and more variable than that observed for
myeloid lines. Protein expression is not related to amplification
of the gene but to variation in the level of bcr-abl mRNA
produced from a single Ph1 template.



Chronic myelogenous leukemia (CML) is a disease of the
pluripotent stem cell (1). In greater than 95% of patients, the
leukemic cells contain the cytogenetic marker known as the
Philadelphia chromosome, or Ph1 (2). This reciprocal
translocation event between the long arms of chromosomes
9 and 22 has been used as a disease-specific marker for
diagnosis and evaluation of therapy. Multiple hematopoietic
lineages, including myeloid and B-lymphoid, contain Ph1 in
early or chronic phase, as well as in the more acute accelerated
and blast crisis phases of the disease.

One molecular consequence of Ph1 is the translocation of
the chromosomal arm containing the c-abl gene on chromosome
9 into the middle of the breakpoint-cluster region (bcr)
gene on chromosome 22 (3-6). Although the precise
translocation breakpoints are variable, an RNA-splicing
mechanism generates a very similar 8-kilobase (kb) mRNA in
each case (5-9). The hybrid bcr-abl message encodes a
structurally altered form of the abl oncogene product, called
P210c-abl (10-13), with an amino-terminal segment derived
from a portion of the exons of bcr on chromosome 22 and a
carboxyl-terminal segment derived from a major portion of
the exons of the c-abl gene on chromosome 9. The chimeric
structure of bcr-abl and the resulting P210c-abl is similar to the
structure of the Abelson murine leukemia virus gag-abl
genome and resulting P160v-abl transforming gene product.
Both proteins have very similar tyrosine kinase activities (10,
11, 14) which can be distinguished by their relative stability
to denaturing detergents and by their ATP requirements from
the recently described tyrosine kinase activity of the c-abl
gene product (15).

In concert with structure modification of the amino-terminal
portion of the abl gene, increased level of expression
has been implicated in activation of c-abl oncogenic potential.
Myeloid and erythroid cell lines and clinical samples
derived from acute-phase CML patients contain about 10-fold
higher levels of the 8-kb bcr-abl mRNA and P210c-abl than
the c-abl mRNA forms (6 and 7 kb) and P145c-abl gene product
(5, 8, 9, 11). The higher level of expression of the chimeric
bcr-abl message in acute-phase cells is not likely to be solely
due to the presence of the bcr promoter sequences at the 5'
end of the gene, since the normal 4.5-kb and 6.7-kb bcr-encoded
mRNA species are expressed at an even lower level
than the normal c-abl messages (5, 6).

We have analyzed a series of Epstein-Barr virus-immortalized
B-lymphoid cell lines derived from CML patients (16).
With such in vitro clonal cell lines, we can evaluate whether
the presence of Ph1 always results in synthesis of the chimeric
bcr-abl message and protein, and whether the quantitative
expression varies for cells of B-lymphoid lineage as compared
to previously examined myeloid cell lines. Our results
show that cell lines that retain Ph1 do express bcr-abl message
and protein, but that the level is generally lower and more
variable than previously seen for myeloid cell lines. The
demonstration that the Ph1 chromosomal template can vary
in its level of expression of P210c-abl suggests that secondary
mechanisms, beyond the translocation itself, contribute to
the regulation of the bcr-abl gene in different cell types or
subclones that derive from the affected stem cell.

MATERIALS AND METHODS 

Cells and Cell Labelings. Epstein-Barr virus-transformed
B-lymphoid cell lines were established from peripheral blood
samples of chronic- and acute-phase CML patients as reported
(16). The cell lines are designated according to patient
number, karyotype, and lineage. For example,
SK-CML7Bt(9,22)-33 refers to CML patient 7, B-lymphoid cell
line, 9;22 translocation (Ph1), cell line 33; and SK-CML7BN-2
refers to B-cell line 2 with a normal karyotype derived from
the same patient. Repeat karyotype analysis was performed
to verify the retention of Ph1 just prior to analysis for abl
protein and RNA. Cells were maintained in RPMI 1640
medium with 20% fetal bovine serum. We have not observed
any consistent pattern of in vitro growth rate that correlates
to the stage of disease at the time of transformation with
Epstein-Barr virus. Cells (1.5 x 10^) were washed twice with
Dulbecco's modified Eagle's medium lacking phosphate and
supplemented with 5% dialyzed fetal bovine serum. Cells
were then resuspended in 2 ml of the minimal medium.
Labeling was started with the addition of [32P]orthophosphate
(1 mCi/ml; ICN; 1 Ci = 37 GBq) and continued at 37°C
for 3^ hr.
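
The cell-line naming convention described above (patient number, lineage, karyotype, and line number) can be unpacked mechanically. The following Python sketch is illustrative only: the regular expression, the function name, and the field names are assumptions inferred from the two example names given in the text, not a scheme published by the authors, and names that follow some other pattern are simply rejected.

```python
import re
from typing import NamedTuple


class CellLine(NamedTuple):
    patient: int        # CML patient number
    lineage: str        # "B" for B-lymphoid
    ph_positive: bool   # True when the name carries t(9,22)
    line_number: int    # individual cell line number


# Assumed pattern: SK-CML<patient>B, then t(9,22) or N, then -<line number>.
_NAME = re.compile(r"^SK-CML(\d+)(B)(t\(9[,;]22\)|N)-(\d+)$")


def parse_cell_line(name: str) -> CellLine:
    """Split a designation such as 'SK-CML7Bt(9,22)-33' into its parts."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized cell line name: {name!r}")
    patient, lineage, karyotype, line_number = match.groups()
    return CellLine(
        patient=int(patient),
        lineage=lineage,
        ph_positive=karyotype.startswith("t"),
        line_number=int(line_number),
    )


if __name__ == "__main__":
    for example in ("SK-CML7Bt(9,22)-33", "SK-CML7BN-2"):
        print(example, parse_cell_line(example))
```

Keeping the karyotype as a boolean makes it easy to pair each translocation-bearing line with the normal-karyotype lines derived from the same patient.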

Immunoprecipitation and Immunoblotting. Immunoprecipitations
were carried out as described (10). Cells (1.5 x 10^)
were washed with phosphate-buffered saline and extracted
with 3-5 ml of phosphate lysis buffer (1% Triton X-100/0.1%
NaDodSO4/0.5% deoxycholate/10 mM Na2HPO4, pH 7.5/
100 mM NaCl) with 5 mM EDTA and 5 mM phenylmethylsulfonyl
fluoride. Extracts were clarified by centrifugation
and precipitated with normal or rabbit anti-abl sera (anti-pEX-2
or anti-pEX-5) (17). The precipitated proteins were
electrophoresed in a NaDodSO4/8% polyacrylamide gel.
32P-labeled proteins were detected by autoradiography.
Alternatively, abl proteins were detected by immunoblotting.
Extracts from unlabeled cells were clarified, and proteins
were concentrated by immunoprecipitation with rabbit antisera
against abl-encoded proteins [anti-pEX-2 and anti-pEX-5
combined (17)] and then fractionated in 8% acrylamide gels.
The proteins were transferred from the gel to nitrocellulose
filters, using protease-facilitated transfer (18). The abl-encoded
proteins were detected using murine monoclonal
antibodies as a probe and peroxidase-conjugated goat anti-mouse
second stage antibody (Bio-Rad) for development.
Rabbit antisera and mouse monoclonal antibodies to abl
proteins were prepared using bacterially expressed regions of
the v-abl protein as immunogens (17, 19). Anti-pEX-2 antibodies
react with the internal tyrosine kinase domain and
anti-pEX-5 antibodies react with the carboxyl-terminal segment
of the abl proteins.

RNA Analysis. RNA was extracted from 10^ cells by the
NaDodSO4/urea/phenol method (20). Polyadenylylated
RNA was purified by oligo(dT) affinity chromatography.
Samples were electrophoresed in a 1% agarose/formaldehyde
gel and transferred to nitrocellulose. abl RNA species
were detected by hybridization with a nick-translated v-abl
fragment probe (21).

DNA Analysis. DNA was prepared from 5 x 10^ cells of
each cell line and processed for Southern blots with a v-abl
probe as described (21).

RESULTS 

Variable Levels of P210c-abl Are Detected in Ph1-Positive Cell
Lines. Ph1-positive and Ph1-negative, Epstein-Barr
virus-transformed B-lymphocyte cell lines derived from the same
patient were examined for P210c-abl synthesis by
immunoprecipitation of [32P]orthophosphate-labeled cell extracts
with anti-abl sera (Fig. 1). The normal c-abl protein P145c-abl
was detected at a similar level in multiple Ph1-positive and
Ph1-negative cell lines. P210c-abl was only detected in the
Ph1-positive cell lines because the bcr-abl chimeric gene
which encodes P210c-abl resides on the Ph1 (4, 5, 11, 13). The
level of P210c-abl was about 4- to 5-fold higher than the level
of P145c-abl in the SK-CML7Bt-33 cell line (Fig. 1A, +). The
Ph1-positive erythroid-progenitor cell line K562 (C) showed
a level of P210c-abl about 10-fold higher than P145c-abl.
However, the level of P210c-abl was about one-fifth that of
P145c-abl in the Ph1-positive SK-CML16Bt-1 cell line (Fig. 1B,
+). Comparison of different autoradiographic exposures
roughly indicated that the level of P210c-abl varies over a
20-fold range between these Ph1-positive B-cell lines.
Analysis of four additional Ph1-positive B-cell lines demonstrated
that the level of P210c-abl fell into two general classes; some
cell lines had a level of P210c-abl similar to SK-CML7Bt-33
and others had the low level similar to SK-CML16Bt-1 (Table
1). This differs from previous studies with Ph1-positive
myeloid cell lines and patient samples derived from
acute-phase CML patients, in which P210c-abl was detected at a
10-fold higher level than P145c-abl (refs. 10 and 11; Table 1).
There was no large difference in level of chimeric mRNA and
P210c-abl expressed in four myeloid/erythroid-lineage
Ph1-positive cell lines (K562, EM2, EM3, CML22, and BV173;
refs. 9 and 11), despite a 4- to 5-fold amplification of
abl-related sequences in the K562 cell line.

Fig. 1. Detection of variable levels of P210c-abl in Ph1-positive
B-cell lines. Production of P145c-abl and P210c-abl in Epstein-Barr
virus-transformed B-cell lines derived from a blast-crisis (A) and a
chronic-phase (B) CML patient was examined by metabolic labeling
with [32P]orthophosphate and immunoprecipitation. Ph1-negative
(-) and Ph1-positive (+) cell lines derived from each patient were
analyzed. The Ph1-negative cell line in A,- is SK-CML7BN-2 and in
B,- is SK-CML16BN-1. The Ph1-positive cell line in A,+ is
SK-CML7Bt-33 and in B,+ is SK-CML16Bt-1. The K562 cell line, a
Ph1-positive erythroid progenitor cell line spontaneously derived
from a blast-crisis patient (33), is represented in C. Cells (1.5 x 10^)
were metabolically labeled with 2 mCi of [32P]orthophosphate for 3-4
hr and then were extracted and clarified by centrifugation. Samples
were immunoprecipitated with control normal serum (lanes 1),
anti-pEX-2 (lanes 2), or anti-pEX-5 (lanes 3) and analyzed by
NaDodSO4/8% PAGE followed by autoradiography with an
intensifying screen (3 days for A and C, 10 days for B).

Detection of different levels of P210c-abl in Fig. 1 could be
due to decreased phosphorylation of P210c-abl, a lower level
of P210c-abl synthesis, or altered stability of the protein. To
help distinguish among these possibilities, the steady-state
level of P210c-abl in the cell lines was assayed by
immunoblotting. The results show that SK-CML7Bt-33 (Fig. 2A, +)
had a higher level of P210c-abl than P145, similar to the results
with metabolic labeling (Fig. 1). We did not detect P210c-abl
by immunoblotting with 2 x 10^ cells of line SK-CML8Bt-3
(Fig. 2B, +). Reconstruction experiments using dilutions of
cell extracts showed that we could detect about 5-10% the
level of P210c-abl expressed in the K562 cell line (data not
shown). We infer that the steady-state level of P210c-abl in
SK-CML8Bt-3 is lower than the level in SK-CML7Bt-33 by
a factor of at least 10. The level of P210c-abl detected in these
assays correlated with the amount of P210c-abl tyrosine kinase
activity that could be detected in vitro (data not shown).

Different Levels of P210c-abl Are Reflected in the Amount of
Stable bcr-abl mRNA. To identify the basis for detection of
variable levels of P210c-abl, we examined the production of
the abl RNA. RNA blot hybridization analysis using a v-abl
probe (Fig. 3) showed that the normal 6- and 7-kb c-abl
mRNAs were present at a similar level in Ph1-positive and
-negative cell lines derived from different patients. However,
the 8-kb mRNA that encodes P210c-abl was detected at a
10-fold higher level in SK-CML7Bt-33 (Fig. 3A, +) than in
SK-CML16Bt-1 (B, +), which correlated with the relative
level of P210c-abl detected in each cell line. Analysis of
additional cell lines demonstrated that the level of 8-kb RNA
directly correlated with the level of P210c-abl (Table 1). The
variation in level of 8-kb RNA detected in these cell lines was
not due to loss or gain of Ph1 because cytogenetic analysis
confirmed the presence of Ph1 in these cell lines (ref. 16 and
data not shown). There was no difference in the copy number
of abl-related sequences as judged by Southern blot analysis
(Fig. 4). Only the K562 cell line control showed an
amplification of abl sequences, as previously reported (22, 23).
These combined data suggest that differential bcr-abl mRNA
expression from a single gene template is responsible for the
variable levels of P210c-abl detected. This could be mediated
by factors influencing the transcription rate of the bcr-abl
gene or the stability of the mRNA.

Table 1. Relative levels of bcr-abl expression in Epstein-Barr
virus-immortalized B-cell lines and myeloid CML lines

Cell line*        CML phase†   Ph1‡   P210§    8-kb mRNA¶
[illegible]       BC           -      -        -
SK-CML8BN-10      Chronic      -      -        -
SK-CML8BN-12      Chronic      -      -        -
SK-CML16BN-1      Chronic      -      -        -
SK-CML35BN-1      Chronic      -      -        -
SK-CML7Bt-33      BC           +      +++      +++
SK-CML11Bt-1      Acc          +      +++      +++
SK-CML11Bt-6      Acc          +      +++      +++
SK-CML8Bt-3       Chronic      +      +        ±
SK-CML16Bt-1      Chronic      +      +        +
SK-CML35Bt-2      Chronic      +      +        +
K562              BC           +      +++++    +++++
BV173             BC           +      +++++    +++++
EM2               BC           +      +++++    +++++

*Cell lines derived from CML patients by transformation with
Epstein-Barr virus as described. Names of cell lines indicate
patient number and Ph1 status: SK-CML7Bt indicates a cell line
derived from patient 7 that carries the 9;22 Ph1 translocation; N
indicates a normal karyotype. Myeloid-erythroid cell lines (K562,
EM2, and BV173) are described in previous publications (9, 11, 22,
33).

†Status of patient at the time cell line was derived. BC, blast crisis;
Acc, accelerated phase.

‡Presence (+) or absence (-) of Ph1 as demonstrated by karyotypic
or Southern blot analysis.

§P210 detected as described in legend to Fig. 1. B-cell lines
derived from blast-crisis and accelerated-phase patients had levels
of P210 3- to 5-fold higher (+++) than levels of P145.
Chronic-phase-derived cell lines had P210 levels lower than or just
equivalent (+) to the level of P145. Myeloid and erythroid lines had
levels of P210 5- to 10-fold higher than P145 (+++++).

¶Eight-kilobase bcr-abl mRNA detected as described in legend to
Fig. 2. Symbols: ±, borderline detectable; +++++, level of 8-kb
mRNA 5- to 10-fold higher than that of the 6- and 7-kb c-abl mRNA
species; +++, level of 8-kb mRNA 3- to 5-fold higher than that of
the 6- and 7-kb species; +, a level approximately equivalent to that
of the 6- and 7-kb messages.

Fig. 2. Analysis of steady-state abl protein levels by
immunoblotting. Cell extracts prepared from 2 x 10^ cells of lines
SK-CML7BN-2 (A,-), SK-CML7Bt-33 (A,+), SK-CML8BN-10 (B,-),
and SK-CML8Bt-3 (B,+) were concentrated by immunoprecipitation
with anti-pEX-2 plus anti-pEX-5. Samples were then
electrophoresed in a NaDodSO4/8% polyacrylamide gel and transferred to
nitrocellulose, using protease-facilitated transfer (18). abl proteins
were detected using a mixture of two monoclonal antibodies directed
against the pEX-2 and pEX-5 abl-protein fragments produced in
bacteria (19) as a probe and a peroxidase-conjugated goat anti-mouse
second-stage antibody (Bio-Rad) for development.

Fig. 3. Comparison of abl RNA levels in Ph1-positive and
-negative B-cell lines. The levels of the normal 6- and 7-kb c-abl
RNAs and the 8-kb bcr-abl RNA were analyzed by blot hybridization
using a v-abl probe. RNA was extracted from Ph1-negative lines
SK-CML7BN-2 (A,-) and SK-CML16BN-1 (B,-), from Ph1-positive
lines SK-CML7Bt-33 (A,+) and SK-CML16Bt-1 (B,+), and
from line K562 (C,+) by the NaDodSO4/urea/phenol method (20).
Polyadenylylated RNA was purified by oligo(dT) affinity
chromatography, and 15 µg of each sample was electrophoresed in a 1%
agarose/formaldehyde gel and then transferred to nitrocellulose. The
blotted RNAs were hybridized with a nick-translated v-abl fragment
probe (21) and then autoradiographed for 4 days.
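
The symbols in Table 1 amount to bins on the ratio of the bcr-abl product to the normal c-abl product. The sketch below is only an illustration of that reading, written in Python: the function name and the exact cut-offs are assumptions drawn from the ranges quoted in the table footnotes, and ratios between 1 and 3, which the footnotes do not cover, are arbitrarily placed in the lowest bin.

```python
def score_relative_level(ratio):
    """Map a bcr-abl : c-abl ratio to a Table 1-style symbol.

    The bin edges are assumptions chosen to mirror the footnote ranges:
    5- to 10-fold higher -> "+++++", 3- to 5-fold higher -> "+++",
    roughly equivalent or lower -> "+", not detected -> "-".
    """
    if ratio is None or ratio <= 0:
        return "-"        # product not detected
    if ratio >= 5:
        return "+++++"    # myeloid/erythroid-type level
    if ratio >= 3:
        return "+++"      # blast-crisis or accelerated-phase B-cell type
    return "+"            # chronic-phase B-cell type


# Ratios quoted in the Results text (P210 relative to P145).
examples = {"K562": 10.0, "SK-CML7Bt-33": 4.5, "SK-CML16Bt-1": 0.2}
for line, ratio in examples.items():
    print(line, score_relative_level(ratio))
```

With the ratios quoted in the text, K562 falls in the top bin, SK-CML7Bt-33 in the middle bin, and SK-CML16Bt-1 in the lowest bin, which agrees with the corresponding rows of Table 1.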



DISCUSSION 

Several lines of evidence suggest that formation of Ph1 is not
the primary event that affects the stem cell in CML. Patients
have been identified that present with the clinical picture of
CML but only later develop Ph1 (1). This observation,
coupled with studies of glucose-6-phosphate
dehydrogenase-heterozygous females with CML that demonstrate
stem-cell clonality by isozyme analysis among cell
populations that lack the Ph1 marker, supports a secondary
or complementary role for Ph1 in the progression of the
disease (24, 25). This chromosome marker is found in
chronic, accelerated, and blast-crisis phases of the disease. It
is likely that Ph1 confers some growth advantage, since cells
with the marker chromosome eventually predominate in the
marrow and peripheral blood even in chronic phase. During
the phase of blast crisis, many patients develop additional
chromosome abnormalities, including duplication of Ph1, a
variety of trisomies, and complex translocations (26). This
is suggestive evidence for Ph1 being a necessary but not
sufficient genetic change for the full evolution of the
disease.

Fig. 4. Southern blot analysis of abl sequences in Ph1-positive
and -negative B-cell lines. High molecular weight DNA (15 µg) was
digested with restriction endonuclease BamHI, separated in a 0.8%
agarose gel, and then transferred to nitrocellulose. The blotted DNA
fragments were hybridized with a nick-translated, 2.4-kb Bgl II v-abl
fragment (1.5 x 10^ cpm/µg; ref. 21) and exposed for 4 days. (A)
Autoradiogram of abl-specific fragments in cell lines HL-60 (lane 1),
EM2 (lane 2), K562 (lane 3), SK-CML7Bt-33 (lane 4), SK-CML8Bt-3
(lane 5), SK-CML16Bt-1 (lane 6), SK-CML11Bt-6 (lane 7),
SK-CML35Bt-2 (lane 8), SK-CML7BN-2 (lane 9), SK-CML8BN-2 (lane
10), and SK-CML35BN-1 (lane 11). (B) Ethidium bromide staining of
agarose gel prior to transfer to nitrocellulose, showing the level of
variation in amount of DNA loaded per lane.

The realization that one molecular result of Ph1 is the
generation of a chimeric bcr-abl protein with functional
characteristics and structure analogous to the gag-abl
transforming protein of the Abelson murine leukemia virus
strengthens the argument for an important role of Ph1 in the
pathogenesis of CML. Although the Abelson virus is
generally considered a rapidly transforming retrovirus, its effects
can range from overcoming growth factor requirements, to
cellular lethality, to induction of highly oncogenic tumors in
a number of hematopoietic cell lineages (27, 28). Even in the
transformation of murine cell targets, there are several lines
of evidence that suggest that the growth-promoting activity of
the v-abl gene product is complemented by further cellular
changes in the production of the malignant-cell phenotype
(29-31).

The regulation of bcr-abl gene expression is complex
because the 5' end of the gene is derived from the non-abl
sequences, bcr, normally found on chromosome 22 (6). The
levels of stable message for the normal bcr gene and the
normal abl gene are both much lower than the level of the
bcr-abl message and protein from cell lines and clinical
specimens derived from myeloid blast-crisis patients (5, 6,
11). Therefore, the high level of bcr-abl expression cannot
simply be attributed to the regulatory sequences associated
with bcr. Possibly, creation of the chimeric gene disrupts the
normal regulatory sequences and results in a higher level of
expression. Variation in bcr-abl expression may result from
secondary changes in the structure of the chimeric gene or
function of trans-acting factors that occur during evolution of
the disease. Our analysis of P210c-abl and the 8-kb mRNA in
Epstein-Barr virus-transformed Ph1-positive B-cell lines
demonstrates that stable message and protein levels from the
bcr-abl gene can vary over a wide range. This variation does
not result from a change in the number of bcr-abl templates
secondary to gene amplification but more likely from changes
in either transcription rate or mRNA stability. We suspect
this range of bcr-abl expression is not limited to lymphoid
cells. Analysis of peripheral blood leukocytes derived from
an unusual CML patient who has been in chronic phase with
myeloid predominance for 16 years showed a level of
P210c-abl one-fifth that of P145c-abl, as detected by metabolic
labeling with [32P]orthophosphate and immunoprecipitation
(S.C., O.N.W., and P. Greenberg, unpublished
observations). Lower levels of expression of the chimeric mRNA
have been demonstrated in clinical samples from
chronic-phase CML patients compared to acute-phase CML patients
(9). Others have reported chronic-phase patients with
variable but, in some cases, relatively high levels of the bcr-abl
mRNA (32). The sampling variation and the heterogeneous
mixture of cell types in clinical samples complicate such
analyses. Further work is needed to evaluate whether there
is a defined change in P210c-abl expression during the
progression of CML. It is interesting to note that among the
limited sample of Ph1-positive B-cell lines we have examined
(Table 1), we have seen higher levels of P210c-abl in those
derived from patients at more advanced stages of the disease.
It will be important to search for cell-type-specific
mechanisms that might regulate expression of bcr-abl from Ph1.

Proteome analysis: Biological assay or data archive?

Paul A. Haynes
Steven P. Gygi
Daniel Figeys
Ruedi Aebersold

Department of Molecular Biotechnology, University of
Washington, Seattle, WA, USA

In this review we examine the current state of proteome analysis. There are
three main issues discussed: why it is necessary to study proteomes; how
proteomes can be analyzed with current technology; and how proteome analysis
can be used to enhance biological research. We conclude that proteome
analysis is an essential tool in the understanding of regulated biological systems.
Current technology, while still mostly limited to the more abundant proteins,
enables the use of proteome analysis both to establish databases of proteins
present, and to perform biological assays involving measurement of multiple
variables. We believe that the utility of proteome analysis in future biological
research will continue to be enhanced by further improvements in analytical
technology.

Contents

1 Introduction
2 Rationale for proteome analysis
2.1 Correlation between mRNA and protein expression levels
2.2 Proteins are dynamically modified and processed
2.3 Proteomes are dynamic and reflect the state of a biological system
3 Description and assessment of current proteome analysis technology
3.1 Technical requirements of proteome technology
3.2 2-D electrophoresis - mass spectrometry: a common implementation of proteome analysis
3.3 Protein identification [...]
3.3.1 [...]
3.3.2 [...]
3.3.3 CE-MS/MS
3.4 Assessment of 2-DE-MS proteome [...]
4 Utility of proteome analysis for biological research
[remaining entries illegible]

1 Introduction

A proteome has been defined as the protein complement
expressed by the genome of an organism, or, in multicellular
organisms, as the protein complement expressed by a
tissue or differentiated cell [1]. In the most common
implementation of proteome analysis the proteins extracted
from the cell or tissue analyzed are separated by high
resolution two-dimensional gel electrophoresis (2-DE),
detected in the gel and identified by their amino acid
sequence. The ease, sensitivity and speed with which
gel-separated proteins can be identified by the use of recently
developed mass spectrometric techniques have dramatically
increased the interest in proteome technology. One
of the most attractive features of such analyses is that complex
biological systems can potentially be studied in their
entirety, rather than as a multitude of individual components.
[...] relationships between mature
gene products in cells. Large-scale proteome characterization
projects have been undertaken for a number of different
organisms and cell types. Microbial proteome projects
currently in progress include, for example: Saccharomyces
cerevisiae [...], Salmonella enterica [3], Spiroplasma
[...] anthropi [6], Haemophilus influenzae [7], [...]
Escherichia coli [9], [...] [10] and Dictyostelium [...].
Projects underway for tissues of more complex organisms
include those for: human bladder squamous cell
carcinomas [12], human liver [13], human plasma [13],
human keratinocytes [12], human fibroblasts [12], mouse
[...]. [...] critically
assess the concept of proteome analysis and the
technical feasibility of establishing complete proteome
maps, and discuss ways in which proteome analysis and
biological research intersect.

Correspondence: Professor Ruedi Aebersold, Department of Molecular
Biotechnology, University of Washington, Box 357730, Seattle, WA,
98195, USA (Tel: +206-685-4235; Fax: +206-685-6392; E-mail:
ruedi@u.washington.edu)

Abbreviations: CID, collision-induced dissociation; MS/MS, tandem
mass spectrometry; SAGE, serial analysis of gene expression

Keywords: Proteome / Two-dimensional polyacrylamide gel
electrophoresis / Tandem mass spectrometry



2 Rationale for proteome analysis 

The dramatic growth in both the number of genome
projects and the speed with which genome sequences
are being determined has generated huge amounts of
sequence information, for some species even complete
genomic sequences [15-17]. The description of the
state of a biological system by the quantitative measurement
of system components has long been a primary
objective in molecular biology. With recent technical
advances including the development of differential
display-PCR [18], cDNA microarray and DNA chip technology
[19, 20] and serial analysis of gene expression
(SAGE) [21, 22], it is now feasible to establish global and
quantitative mRNA expression maps of cells and tissues,
in which the sequence of all the genes is known, at a
speed and sensitivity which is not matched by current
protein analysis technology. Given the long-standing
paradigm in biology that DNA synthesizes RNA which
synthesizes protein, and the ability to rapidly establish
global, quantitative mRNA expression maps, the questions
which arise are why technically complex proteome
projects should be undertaken and what specific types of
information could be expected from proteome projects
which cannot be obtained from genomic and transcript
profiling projects. We see three main reasons for proteome
analysis to become an essential component in the
comprehensive analysis of biological systems: (i) protein
expression levels are not predictable from the mRNA
expression levels, (ii) proteins are dynamically modified
and processed in ways which are not necessarily
apparent from the gene sequence, and (iii) proteomes
are dynamic and reflect the state of a biological system.

2.1 Correlation between mRNA and protein expression
levels

Interpretations of quantitative mRNA expression profiles
frequently implicitly or explicitly assume that for specific
genes the transcript levels are indicative of the levels of
protein expression. As part of an ongoing study in our
laboratory, we have determined the correlation of expression
at the mRNA and protein levels for a population of
selected genes in the yeast Saccharomyces cerevisiae
growing at mid-log phase (S. P. Gygi et al., submitted for
publication). mRNA expression levels were calculated
from published SAGE frequency tables [22]. Protein
expression levels were quantified by metabolic radiolabeling
of the yeast proteins, liquid scintillation counting
of the protein spots separated by high resolution 2-DE
and mass spectrometric identification of the protein(s)
migrating to each spot. The selected 80 samples constitute
a relatively homogeneous group with respect to predicted
half-life and expression level of the protein products.
Thus far, we have found a general trend but no
strong correlation between protein and transcript levels
(Fig. 1). For some genes studied equivalent mRNA transcript
levels translated into protein abundances which
varied by more than 50-fold. Similarly, equivalent
steady-state protein expression levels were maintained by
transcript levels varying by as much as 40-fold (S. P. Gygi
et al., submitted). These results suggest that even for a
population of genes predicted to be relatively homogeneous
with respect to protein half-life and gene expression,
the protein levels cannot be accurately predicted
from the level of the corresponding mRNA transcript.
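
The two quantities discussed in this section, an overall correlation and the fold-range of one measurement at a fixed value of the other, can be made concrete with a short calculation. The Python sketch below is illustrative only: the five data points are invented placeholders, not values from the study, and the use of a Pearson coefficient on log-transformed levels is an assumption about how such a comparison might be done.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fold_range(values):
    """Ratio of the largest to the smallest value in a group."""
    return max(values) / min(values)


# Hypothetical per-gene levels in arbitrary units (placeholders only).
mrna = [1.0, 1.0, 2.0, 5.0, 10.0]
protein = [2.0, 100.0, 8.0, 30.0, 40.0]

# Correlation computed on log-transformed levels.
r = pearson([math.log10(m) for m in mrna],
            [math.log10(p) for p in protein])

# Spread of protein abundance among genes with the same mRNA level.
same_mrna = [p for m, p in zip(mrna, protein) if m == 1.0]

print("log-log Pearson r:", round(r, 2))
print("protein fold-range at equal mRNA level:", fold_range(same_mrna))
```

In this toy data set two genes share the same transcript level but differ 50-fold in protein abundance, the kind of spread the text describes, while the overall coefficient still reflects a loose general trend.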



2.2 Proteins are dynamically modified and processed

In the mature, biologically active form many proteins are
post-translationally modified by glycosylation,
phosphorylation, prenylation, acylation, ubiquitination or one or
more of many other modifications [23] and many proteins
are only functional if specifically associated or complexed
with other molecules, including DNA, RNA, proteins
and organic and inorganic cofactors. Frequently,
modifications are dynamic and reversible and may alter
the precise three-dimensional structure and the state of
activity of a protein. Collectively, the state of modification
of the proteins which constitute a biological system
is an important indicator of the state of the system. The
type of protein modification and the sites modified at a
specific cellular state can usually not be determined
from the gene sequence alone.

Figure 1. Correlation between mRNA and protein levels in yeast cells.
For a selected population of 80 genes, protein levels were measured
by 35S-radiolabeling and mRNA levels were calculated from
published SAGE tables. Inset: expanded view of the low abundance region.
For more experimental details, also see Figs. 5 and 6 (S. P. Gygi et al.,
submitted).

2.3 Proteomes are dynamic and reflect the state of a
biological system

A single genome can give rise to many qualitatively and
quantitatively different proteomes. Specific stages of the
cell cycle and states of differentiation, responses to
growth and nutrient conditions, temperature and stress,
and pathological conditions represent cellular states
which are characterized by significantly different
proteomes. The proteome, in principle, also reflects events
that are under translational and post-translational
control. It is therefore expected that proteomics will be able
to provide the most precise and detailed molecular
description of the state of a cell or tissue, provided that the
external conditions defining the state are carefully
determined. In answer to the question of whether the study
of proteomes is necessary for the analysis of biomolecular
systems, it is evident that the analysis of mature protein
products in cells is essential as there are numerous
levels of control of protein synthesis, degradation,
processing and modification, which are only apparent by
direct protein analysis.



3 Description and assessment of current proteome
analysis technology

3.1 Technical requirements of proteome technology

In biological systems the level of expression as well as
the states of modification, processing and macromolecular
association of proteins are controlled and modulated
depending on the state of the system. Comprehensive
analysis of the identity, quantity and state of modification
of proteins therefore requires the detection and
quantitation of the proteins which constitute the system,
and analysis of differentially processed forms. There are
a number of inherent difficulties in protein analysis
which complicate these tasks. First, proteins cannot be
amplified. It is possible to produce large amounts of a
particular protein by over-expression in specific cell
systems. However, since many proteins are dynamically
post-translationally modified, they cannot be easily
amplified in the form in which they finally function in the
biological system. It is frequently difficult to purify from
the native source sufficient amounts of a protein for
analysis. From a technological point of view this
translates into the need for high sensitivity analytical
techniques. Second, many proteins are modified and
processed post-translationally. Therefore, in addition to the
protein identity, the structural basis for differentially
modified isoforms also needs to be determined. The
distribution of a constant amount of protein over several
differentially modified isoforms further reduces the
amount of each species available for analysis. The
complexity and dynamics of post-translational protein
editing thus significantly complicate proteome studies.
Third, proteins vary dramatically with respect to their
solubility in commonly used solvents. There are few, if
any, solvent conditions in which all proteins are soluble
and which are also compatible with protein analysis. This
makes the development of protein purification methods
particularly difficult since both protein purification and
solubility have to be achieved under the same
conditions. Detergents, in particular sodium dodecyl sulfate
(SDS), are frequently added to aqueous solvents to
maintain protein solubility. The compatibility with SDS
is a big advantage of SDS polyacrylamide gel
electrophoresis (SDS-PAGE) over other protein separation
techniques. Thus, SDS-PAGE and two-dimensional gel
electrophoresis, which also uses SDS and other
detergents, are the most general and preferred methods for
the purification of small amounts of proteins, provided
that activity does not necessarily need to be maintained.
Lastly, the number of proteins in a given cell system is
typically in the thousands. Any attempt to identify and
categorize all of these must use methods which are as
rapid as possible to allow completion of the project
within a reasonable time frame. Therefore, a successful,
general proteomics technology requires high sensitivity,
high throughput, the ability to differentiate differentially
modified proteins, and the ability to quantitatively
display and analyze all the proteins present in a sample.

3.2 2-D electrophoresis - mass spectrometry: a common
implementation of proteome analysis

The most common currently used implementation of
proteome analysis technology is based on the separation
of proteins by two-dimensional (IEF/SDS-PAGE) gel
electrophoresis and their subsequent identification and
analysis by mass spectrometry (MS) or tandem mass
spectrometry (MS/MS). In 2-DE, proteins are first
separated by isoelectric focusing (IEF) and then by
SDS-PAGE, in the second, perpendicular dimension.
Separated proteins are visualized at high sensitivity by staining
or autoradiography, producing two-dimensional arrays of
proteins. 2-DE gels are, at present, the most commonly
used means of global display of proteins in complex
samples. The separation of thousands of proteins has
been achieved in a single gel [24, 25] and differentially
modified proteins are frequently separated. Due to the
compatibility of 2-DE with high concentrations of
detergents, protein denaturants and other additives promoting
protein solubility, the technique is widely used.

The second step of this type of proteome analysis is the
identification and analysis of separated proteins.
Individual proteins from polyacrylamide gels have traditionally
been identified using N-terminal sequencing [26, 27],
internal peptide sequencing [28, 29], immunoblotting or
comigration with known proteins [30]. The recent
dramatic growth of large-scale genomic and expressed
sequence tag (EST) sequence databases has resulted in a
fundamental change in the way proteins are identified by
their amino acid sequence. Rather than by the traditional
methods described above, protein sequences are now
frequently determined by correlating mass spectral or
tandem mass spectral data of peptides derived from
proteins, with the information contained in sequence
databases [31-33].

There are a number of alternative approaches to pro- 
teome analysis currently under development. There is 
considerable interest in developing a proteome analysis 
strategy which bypasses 2-DE altogether, because it is
considered a relatively slow and tedious process, and
because of perceived difficulties in extracting proteins
from the gel matrix for analysis. However, 2-DE as a
starting point for proteome analysis has many advantages
compared to other techniques available today. The
most significant strengths of the 2-DE-MS approach 
include the relatively uniform behavior of proteins in 
gels, the ability to quantify spots and the high resolution 
and simultaneous display of hundreds to thousands of 
proteins within a reasonable time frame. 

A schematic diagram of a typical procedure of the identi- 
fication of gel-separated proteins is shown in Fig. 2. Pro- 
tein spots detected in the gel are enzymatically or chemi- 
cally fragmented and the peptide fragments are isolated 
for analysis, as already indicated, most frequently by MS 
or MS/MS. There are numerous protocols for the generation
of peptide fragments from gel-separated proteins.
They can be grouped into two categories, digestion in
the gel slice [28, 34] or digestion after electrotransfer out
of the gel onto a suitable membrane ([29, 35-37] and
reviewed in [38]). In most instances either technique is
applicable and yields good results. The analysis of MS or
MS/MS data is an important step in the whole process
because MS instruments can generate an enormous
amount of information which cannot easily be managed
manually. Recently, a number of groups have developed
software systems dedicated to the use of peptide MS
and MS/MS spectra for the identification of proteins.
Proteins are identified by correlating the information
contained in the MS spectra of protein digests or 
MS/MS spectra of individual peptides with data con- 
tained in DNA or protein sequence databases. 

Figure 2. Schematic diagram of a procedure for identification of
gel-separated proteins. Peptides can either be separated by a technique
such as LC or CE, or infused as a mixture and sorted in the MS. Database
searching can either be performed on peptide masses from an
MS spectrum, peptide fragment masses from CID spectra of peptides,
or a combination of both.

The systems we are currently using in our laboratory are
based on the separation of the peptides contained in protein
digests by narrow bore or capillary liquid chromatography
[39, 40] or capillary electrophoresis [41], the analysis
of the separated peptides by electrospray ionization
(ESI) MS/MS, and the correlation of the generated
peptide spectra with sequence databases using the
SEQUEST program developed at the University of Washington
[32, 33]. The system automatically performs the
following operations: a particular peptide ion characterized
by its mass-to-charge ratio is selected in the MS out
of all the peptide ions present in the system at a particular
time; the selected peptide ion is collided in a collision
cell with argon (collision-induced dissociation,
CID) and the masses of the resulting fragment ions are
determined in the second sector of the tandem MS; this
experimentally determined CID spectrum is then correlated
with the CID spectra predicted from all the peptides
in a sequence database which have essentially the
same mass as the peptide selected for CID; this correlation
matches the isolated peptide with a sequence segment
in a database and thus identifies the protein from
which the peptide was derived. There are a number of
alternative programs which use peptide CID spectra for
protein identification, but we use the SEQUEST system
because it is currently the most highly automated program
and has proven to be successful, versatile and
robust.
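
The database-correlation step described above can be illustrated with a
minimal Python sketch. It is not the SEQUEST algorithm: a simple count of
matched fragment masses stands in for the correlation score. The function
names, the two-protein database, the mass tolerances and the simplified
trypsin rule (cleave after every K or R) are all assumptions made for
illustration only.

```python
# Simplified illustration of matching a CID spectrum against a sequence
# database. NOT the SEQUEST scoring function: a shared-peak count is used
# in place of a correlation score. All names and tolerances are assumed.

# Monoisotopic residue masses (Da) for the amino acids used in the example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragment_ions(seq):
    """Predicted m/z values of singly charged b and y ions."""
    ions = []
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON
        ions.extend([b, y])
    return ions


def tryptic_peptides(protein):
    """Cleave after every K or R (proline rule ignored for brevity)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def identify(precursor_mass, observed, database, mass_tol=0.5, frag_tol=0.5):
    """Return (score, protein_id, peptide) for the best candidate, or None."""
    best = None
    for protein_id, sequence in database.items():
        for pep in tryptic_peptides(sequence):
            # Step 1: keep only peptides with essentially the same mass.
            if abs(peptide_mass(pep) - precursor_mass) > mass_tol:
                continue
            # Step 2: count predicted fragments present in the spectrum.
            score = sum(
                any(abs(mz - obs) <= frag_tol for obs in observed)
                for mz in fragment_ions(pep)
            )
            if best is None or score > best[0]:
                best = (score, protein_id, pep)
    return best


# Hypothetical two-protein database and a noise-free simulated spectrum.
DATABASE = {"protA": "GASPVKTLNDER", "protB": "AAGGKSSPPR"}
spectrum = fragment_ions("TLNDER")
result = identify(peptide_mass("TLNDER"), spectrum, DATABASE)
print(result)
```

The precursor-mass filter mirrors the restriction to peptides with
essentially the same mass as the selected ion; only the scoring step is
replaced by a cruder measure.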

3.3 Protein identification by LC-MS/MS, capillary
LC-MS/MS and CE-MS/MS

It has been demonstrated repeatedly that MS has a very 
high intrinsic sensitivity. For the routine analysis of gel-separated
proteins at high sensitivity, the most significant
challenge is the handling of small amounts of
sample. The crux of the problem is the extraction and
transferal of peptide mixtures generated by the digestion
of low nanogram amounts of protein from gels into the
MS/MS system without significant loss of sample or
introduction of unwanted contaminants. We employ
three different systems for introducing gel-purified samples
into an MS, depending on the level of sensitivity
required. As an approximate guideline, for samples containing
tens of picomoles of peptides, LC-MS/MS is
most appropriate; for samples containing low picomole
amounts to high femtomole amounts we use capillary
LC-MS/MS; and for samples containing femtomoles or
less, CE-MS/MS is the method of choice.



3.3.1 LC-MS/MS

The coupling of an MS to an HPLC system using a 
0.5 mm diameter or bigger reverse phase (RP) column
has been described in detail [42]. This system has several
advantages if a large number of samples are to be analyzed
and all are available in sufficient quantity. The
LC-MS and database searching program can be run in a
fully automated mode using an autosampler, thus maximizing
sample throughput and minimizing the need for
operator interference. The relatively large column is
tolerant of high levels of impurities from either gel preparation
or sample matrix. Lastly, if configured with a
flow-splitter and micro-sprayer [40], analyses can be performed
on a small fraction of the sample (less than 5%)
while the remainder of the sample is recovered in very
pure solvents. This latter feature is particularly useful
when an orthogonal technique is also used to analyze
peptide fractions, such as scintillation of an introduced
radiolabel, and this data can be correlated with peptides 
identified by CID spectra. 



3.3.2 Capillary LC-MS

An increase of sensitivity of approximately tenfold can be 
achieved by using a capillary LC system with a 100 µm ID
column rather than a 0.5 mm ID column as referred to
above. Since very low flow rates are required for such
columns, most reports have used a precolumn flow splitting
system for producing solvent gradients. We have
recently described the design and construction of a novel
gradient mixing system which enables the formation
of reproducible gradients at very low flow rates (low
nL/min) without the need for flow splitting (A. Ducret
et al., submitted for publication). Using this capillary
LC-MS/MS system we were able to identify gel-separated
proteins if low picomole to high femtomole amounts
were loaded onto the gel [40]. This system is as yet not
automated and, like all capillary LC systems, is prone to
blockage of the columns by microparticulates when analyzing
gel-separated proteins.

3.3.3 CE-MS/MS

The highest level of sensitivity for analyzing gel-separated
proteins can be achieved by using capillary electrophoresis -
mass spectrometry (CE-MS). We have described
in the past a solid-phase extraction capillary electrophoresis
(SPE-CE) system which was used with triple
quadrupole and ion trap ESI-MS/MS systems for the
identification of proteins at the low femtomole to subfemtomole
sensitivity level [43, 44]. While this system is
highly sensitive, its operation is labor-intensive and
has not been automated. In order to devise an
analytical system with both the sensitivity of a CE and
the level of automation of LC, we have constructed
microfabricated devices for the introduction of samples
into ESI-MS for high-sensitivity peptide analysis.

The basic device is a piece of glass into which channels
of 10-30 µm in depth and 50-70 µm in diameter are
etched by using photolithography/etching techniques
similar to the ones used in the semiconductor industry.
(A simple device is shown in Fig. 3). The channels are
connected to an external high voltage power supply [45].
Samples are manipulated on the device and off the
device to the MS by applying different potentials to the
reservoirs. This creates a solvent flow by electroosmotic
pumping which can be redirected by changing the position
of the electrode. Therefore, without the need for
valves or gates and without any external pumping, the
flow can be redirected by simply switching the position
of the electrodes on the device. The direction and rate of
the flow can be modulated by the size and the polarity
of the electric field applied and also by the charge state
of the surface.

Figure 3. Schematic illustration of a
microfabricated analytical system for CE,
consisting of a micromachined device,
coated capillary electroosmotic pump,
and microelectrospray interface. The
dimensions of the channels and reservoir
are as indicated in the text. The channels
on the device were graphically enhanced
to make them more visible. Reproduced
from [45], with permission.

The type of data generated by the system is illustrated in
Fig. 4, which shows the mass spectrum of a peptide sample
representing the tryptic digest of carbonic anhydrase at
290 fmol/µL. Each numbered peak indicates a peptide successfully
identified as being derived from carbonic anhydrase.
Some of the unassigned signals may be chemical
or peptide contaminants. The MS is programmed to automatically
select each peak and subject the peptide to CID.
The resulting CID spectra are then used to identify the
protein by correlation with sequence databases. Therefore,
this system allows us to concurrently apply a number of
protein digests onto the device, to sequentially mobilize
the samples, to automatically generate CID spectra of
selected peptide ions and to search sequence databases
for protein identification. These steps are performed automatically
without the need for user input and proteins can
be identified at very low femtomole level sensitivity at a
rate of approximately one protein per 15 min.

3.4 Assessment of 2-DE-MS proteome technology 

Using a combination of the analytical techniques described
above we have identified the 80 protein spots
indicated in Fig. 5. The protein pattern was generated by
separating a total of 40 micrograms of protein contained
in a total cell lysate of the yeast strain YPH499 by high
resolution 2-DE and silver staining of the separated proteins.
To estimate how far this type of proteome analysis
can penetrate towards the identification of low abundance
proteins, we have calculated the codon bias of the
genes encoding the respective proteins.

Figure 4. MS spectrum of a tryptic digest
of carbonic anhydrase using the microfabricated
system shown in Fig. 3. 290
fmol/µL of carbonic anhydrase tryptic
digest was infused into a Finnigan LCQ
ion trap MS. Each peak was selected for
CID, and those which were identified as
containing peptides derived from carbonic
anhydrase are numbered. Reproduced
from [45], with permission.

Figure 5. 2-DE separation of a lysate of yeast cells, with identified proteins highlighted. The first dimension of separation was an IPG from
[illegible] and the second dimension was a 10%T SDS-PAGE gel. Proteins were visualized by silver staining. Further details of experimental
procedures are included in S. P. Gygi et al. (submitted).

Codon bias is a calculated measure of the degree of redundancy of triplet
DNA codons used to produce each amino acid in a
particular gene sequence. It has been shown to be a
useful indicator of the level of the protein product of a
particular gene sequence present in a cell [46]. The general
rule which applies is that the higher the value of the
codon bias calculated for a gene, the more abundant the
protein product of that gene becomes. The calculated
codon bias values corresponding to the proteins identified
in Fig. 5 are shown in Fig. 6B. Nearly all of the proteins
identified (> 95%) have codon bias values of > 0.2,
indicating they are highly abundant in cells. In contrast,
codon bias values calculated for the entire yeast genome
(Fig. 6A) show that the majority of proteins present in
the proteome have a codon bias of < 0.2 and are thus of
low abundance.
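
To make the idea of a codon bias value concrete, the sketch below computes
one widely used formulation, the codon bias index
CBI = (N_pfr - N_rand) / (N_tot - N_rand), where N_pfr counts preferred
codons, N_tot counts codons of amino acids that have a preferred codon, and
N_rand is the number of preferred codons expected if synonymous codons were
used at random. This may not be the exact measure used in [46]. The function
name and the two-codon preferred set are assumptions chosen for illustration;
a real calculation would use the full preferred-codon set of the organism.

```python
from collections import defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = defaultdict(list)
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS[_aa].append(_codon)


def codon_bias_index(gene, preferred):
    """Codon bias index of an in-frame coding sequence (DNA alphabet).

    `preferred` is a set of preferred codons; it is an input assumption,
    not something this sketch derives.
    """
    n_pfr = 0      # observed preferred codons
    n_tot = 0      # codons of amino acids that have a preferred codon
    n_rand = 0.0   # preferred codons expected under random synonymous usage
    for pos in range(0, len(gene) - len(gene) % 3, 3):
        codon = gene[pos:pos + 3].upper()
        aa = CODON_TABLE.get(codon)
        if aa is None or aa == "*":
            continue
        synonymous = SYNONYMS[aa]
        n_preferred = sum(1 for c in synonymous if c in preferred)
        # Skip amino acids without a choice or without a preferred codon.
        if len(synonymous) == 1 or n_preferred == 0:
            continue
        n_tot += 1
        n_pfr += codon in preferred
        n_rand += n_preferred / len(synonymous)
    if n_tot == n_rand:
        raise ValueError("no informative codons in sequence")
    return (n_pfr - n_rand) / (n_tot - n_rand)


# Hypothetical preferred codons for Phe and Lys only.
PREFERRED = {"TTC", "AAG"}
fully_biased = codon_bias_index("TTCAAGTTCAAG", PREFERRED)
unbiased = codon_bias_index("TTCAAATTTAAG", PREFERRED)
print(fully_biased, unbiased)

# Applying the 0.2 cutoff discussed in the text.
likely_abundant = fully_biased > 0.2
```

A value near 1 means that only preferred codons are used, a value near 0
means usage is indistinguishable from random choice among synonymous codons.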

This finding is of considerable importance in our assessment
of the current status of proteome analysis technology.
It is clear that even using highly sensitive analytical
techniques, we are only able to visualize and identify the
more abundant proteins. Since many important regulatory
proteins are present only at low abundance, these
would not be amenable to analysis using such techniques.
This situation would be exacerbated in the analysis
of proteomes containing many more proteins than
the approximately 6000 gene products present in yeast
cells [16]. In the analysis of, for example, the proteome
of any human cells, there are potentially 50000-100000
gene products [47]. Inherent limitations on the amount
of protein that can be loaded on 2-DE, and the number
of components that can be resolved, indicate that only
the most highly abundant fraction of the many gene
products could be successfully analyzed. One approach
that has been employed to circumvent these limitations
is the use of very narrow range immobilized pH gradient
strips for the first-dimension separation of 2-DE [48].
Since only those proteins which focus within the narrow
range will enter the second dimension of separation, a
much higher sample loading within the desired range is
possible. This, in turn, can lead to the visualization and
identification of less abundant proteins.
Figure 6. Calculated codon bias values for yeast proteins. (A) Distribution
of calculated values for the entire yeast proteome. (B) Distribution
of calculated values for the subset of 80 identified proteins also
shown in Figs. 1 and 5. Further details of experimental procedures are
included in S. P. Gygi et al. (submitted).



4 Utility of proteome analysis for biological
research

For the success of proteomics as a mainstream approach
to the analysis of biological systems it is essential to
define how proteome analysis and biological research
projects intersect. Without a clear plan for the implementation
of proteome-type approaches into biological research
projects the full impact of the technology cannot
be realized. The literature indicates that proteome analysis
is used both as a database/data archive, and as a biological
assay or biological research tool.

4.1 The proteome as a database 

The use of proteomics as a database or data archive
essentially entails an attempt to identify all the proteins
in a cell or species and to annotate each protein with the
known biological information that is relevant for each
protein. The level of annotation can, of course, be extensive.
The most common implementation of this idea is
the separation of proteins by high resolution 2-DE, the
identification of each detected protein spot and the
annotation of the protein spots in a 2-DE gel database
format. This approach is complicated by the fact that it is
difficult to precisely define a proteome and to decide
which proteome should be represented in the database.
In contrast to the genome of a species, which is essentially
static, the proteome is highly dynamic. Processes
such as differentiation, cell activation and disease can all
significantly change the proteome of a species. This is
illustrated in Fig. 7. The figure shows two high-resolution
2-DE maps of proteins isolated from rat serum.
Fig. 7A is from the serum of normal rats, while Fig. 7B
is from acute-phase serum of rats after
prior treatment with an inflammation-causing agent [49].
It is obvious that the protein patterns are significantly
different in several areas, raising the question of exactly
which proteome is being described.

Therefore, a comprehensive proteome database of a spe- 
cies or cell type needs to contain all of the parameters 
which describe the state and the type of the cells from 
which the proteins were extracted as well as the software 
tools to search the database with queries which reflect 
the dynamics of biological systems. A comprehensive 
proteome database should be capable of quantitatively 
describing the fate of each protein if specific systems
and pathways are activated in the cell. Specifically, the
quantity, the degree of modification, the subcellular location
and the nature of molecules specifically interacting
with a protein as well as the rate of change of these
variables should be described. Using these admittedly
stringent criteria, there is currently no complete proteome
database. A number of such databases are, however, in
the process of being constructed. The most advanced
among them, in our opinion, are the yeast protein database
YPD [50] (accessible at http://www.ypd.com) and
the human 2D-PAGE databases of the Danish Centre
for Human Genome Research [12] (accessible at
http://biobase.dk/cgi-bin/celis). While neither can be considered
complete as not all of the potential gene products
are identified, both contain extensive annotation
of supplemental information for many of the spots
which are positively identified in reference samples. 

4.2 The proteome as a biological assay

The use of proteome analysis as a biological assay or
research tool represents an alternative approach to integrating
biology with proteomics. To investigate the state
of a system, samples are subjected to a specific process
that allows the quantitative or qualitative measurement
of some of the variables which describe the system. In
typical biochemical assays one variable (e.g., enzyme
activity) of a single component (e.g., a particular enzyme)
is measured. Using proteomics as an assay, multiple
variables (e.g., expression level, rate of synthesis,
phosphorylation state, etc.) are measured concurrently
on many (ideally all) of the proteins in a sample. The
use of proteomics as an assay is a less far-reaching proposition
than the construction of a comprehensive proteome
database. It does, however, represent a pragmatic
approach which can be adapted to investigate specific
systems and pathways, as long as the interpretation of
the results takes into account that with current technology
not all of the variables which describe the system
can be observed (see Section 3.4).

A common implementation of proteome analysis as a
biological assay is when a 2-DE protein pattern generated
from the analysis of an experimental sample is
compared to an array of reference patterns representing
different states of the system under investigation. The
state of the experimental system at the time the sample
was generated is therefore determined by the quantitative
comparative analysis of hundreds to a few thousand
proteins. Comparative analysis of the 2-DE patterns furthermore
highlights quantitative and qualitative differences
in the protein profiles which correlate with the
state of the system. For this type of analysis it is not
essential that all the proteins are identified or even visualized,
although the results become more informative as
more proteins are compared. It is obvious, however, that
the possibility to identify any protein deemed characteristic
for a particular state dramatically enhances this
approach by opening up new avenues for experimentation.
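
The comparison of an experimental 2-DE pattern against reference patterns
can be sketched as follows. This is only an illustration under stated
assumptions: each pattern is reduced to a dictionary of spot intensities
keyed by a spot identifier, similarity is measured by the Pearson
correlation over shared spots, and the spot names, intensities, function
names and the two-fold threshold are hypothetical. Real 2-DE image analysis
also requires spot detection, gel matching and normalization, none of which
is shown here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def closest_state(sample, references):
    """Return (state, correlation) of the most similar reference pattern."""
    best_state, best_r = None, -2.0
    for state, pattern in references.items():
        shared = sorted(set(sample) & set(pattern))
        r = pearson([sample[s] for s in shared], [pattern[s] for s in shared])
        if r > best_r:
            best_state, best_r = state, r
    return best_state, best_r


def changed_spots(sample, reference, fold=2.0):
    """Spots whose intensity differs from the reference by at least `fold`."""
    changed = []
    for spot in sorted(set(sample) & set(reference)):
        ratio = sample[spot] / reference[spot]
        if ratio >= fold or ratio <= 1.0 / fold:
            changed.append(spot)
    return changed


# Hypothetical spot intensities (arbitrary units) for two reference states.
REFERENCES = {
    "normal": {"s1": 10.0, "s2": 20.0, "s3": 5.0, "s4": 40.0},
    "acute_phase": {"s1": 10.0, "s2": 5.0, "s3": 30.0, "s4": 40.0},
}
sample = {"s1": 11.0, "s2": 6.0, "s3": 28.0, "s4": 39.0}

state, r = closest_state(sample, REFERENCES)
markers = changed_spots(sample, REFERENCES["normal"])
print(state, round(r, 3), markers)
```

The spots returned by `changed_spots` correspond to the proteins one would
then try to identify as candidate markers of the state.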
Figure 7. High resolution 2-DE map of proteins isolated from rat serum with or without prior exposure to an inflammation-causing
agent. (A) Normal rat serum, (B) acute-phase serum from rats which had previously been exposed to
an inflammation-causing agent. The first dimension of separation is an IPG from pH 4-10, and the second dimension
is a 7.5-17.5%T gradient SDS-PAGE gel. Proteins were visualized by staining with amido black. Further details
of experimental procedures are included in [14, 49].
Proteome analysis as a biological assay has been successfully
used in the field of toxicology, to characterize
disease states or to study differential activation of cells.
The approach is limited, of course, by the fact that only
the visible protein spots are included in the assay, and it
is well known that a substantial but far from complete
fraction of cellular proteins are detected if a total cell
lysate is separated by 2-DE. Proteins may not be
detected in 2-DE gels because they are not abundant
enough to be visualized by the detection method used,
because they do not migrate within the boundaries (size,
pI) resolved by the gel, because they are not soluble
under the conditions used, or for other reasons. 

A different way to use proteome analysis as a biological 
assay to define the state of a biological system is to take 
advantage of the wealth of information contained in 
2-DE protein patterns. 2-DE is referred to as two-dimensional
because of the electrophoretic mobility and the
isoelectric points which define the position of each protein
in a 2-DE pattern. In addition to the two dimensions
used to generate the protein patterns, a number of
additional data dimensions are contained in the protein 
patterns. Some of these dimensions such as protein 
expression level, phosphorylation state, subcellular loca- 
tion, association with other proteins, rate of synthesis or 
degradation indicate the activity state of a protein or a 
biological system. Comparative analysis of 2-DE protein 
patterns representing different states is therefore ideally 
suited for the detection, identification and analysis of 
suitable markers. Once again it must be emphasized that
in this type of experiment only a fraction of the cellular
proteins is analyzed. Since many regulatory proteins are
of low abundance, this limitation is a concern, particularly
in cases in which regulatory pathways are being
investigated.

5 Concluding remarks

In this report we have addressed three main issues
related to proteome analysis. First, we have discussed 
the rationale for studying proteomes. Second, we have 
assessed the technical feasibility of analyzing proteomes 
and described current proteome technology, and third, 
we have analyzed the utility of proteome analysis for biological
research. It is apparent that proteome analysis is
an essential tool in the analysis of biological systems. 
The multi-level control of protein synthesis and degrada- 
tion in cells means that only the direct analysis of 
mature protein products can reveal their correct identi- 
ties, their relevant state of modification and/or associa- 
tion and their amounts. Recently developed methods 
have enabled the identification of proteins at ever-increasing
sensitivity levels and at a high level of automation
of the analytical processes. A number of technical
challenges, however, remain. While it is currently
possible to identify essentially any protein spots that can
be visualized by common staining methods, it is apparent
that without prior enrichment only a relatively
small and highly selected population of long-lived,
highly expressed proteins is observed. There are many
more proteins in a given cell which are not visualized by
such methods. Frequently it is the low abundance proteins
that execute key regulatory functions.
We have outlined the two principal ways proteome analysis
is currently being used to intersect with biological
research projects: the proteome as a database or data
archive and proteome analysis as a biological assay. Both
approaches have in common that at present they are con- 
ceptually and technically limited. Current proteome data- 
bases typically are limited to one cell type and one state 
of a cell and therefore do not account for the dynamics 
of biological systems. The use of proteome analysis as a 
biological assay can provide a wealth of information, but 
it is limited to the proteins detected and is therefore not 
truly proteome-wide. These limitations in proteomics are 
to a large extent a reflection of the fact that proteins in
their fully processed form cannot easily be amplified and 
are therefore difficult to isolate in amounts sufficient for 
analysis or experimentation. The fact that to date no 
complete proteome has been described further attests to 
these difficulties. With continued rapid progress in protein
analysis technology, however, we anticipate that the
goal of complete proteome analysis will eventually 
become attainable. 
Although the mature neutrophil is one of the better characterized
mammalian cell types, the mechanisms of myeloid differentiation are
incompletely understood at the molecular level. A mouse promyelocytic
cell line (MPRO), derived from murine bone marrow cells and arrested
developmentally by a dominant-negative retinoic acid receptor,
morphologically differentiates to mature neutrophils in the presence
of 10 µM retinoic acid. An extensive catalog was prepared of the gene
expression changes that occur during morphologic maturation. To do
this, 3'-end differential display, oligonucleotide chip array
hybridization, and 2-dimensional protein electrophoresis were used.
A large number of genes whose mRNA levels are modulated during
differentiation of MPRO cells were identified. The results suggest the
involvement of several transcription regulatory factors not previously
implicated in this process, but they also emphasize the importance of
events other than the production of new transcription factors.
Furthermore, gene expression patterns were compared at the level of
mRNA and protein, and the correlation between the 2 parameters was
studied. (Blood. 2001;98:513-524)

Introduction

Studies of normal myeloid maturation from many laboratories have
identified genes that may play critical roles in myeloid differentiation.
Current studies suggest that these events are dependent on a
cascade of molecular changes that involve complex modulation of
mRNA transcription. Furthermore, studies of acute leukemia have
suggested that the disease arises from the accumulation of myeloid
precursors arrested at early stages of differentiation and associated,
in many cases, with chromosomal rearrangements that alter the
structure of specific transcription factors. Nevertheless, the molecular
events underlying the production of mature myeloid cells are
not well understood and appear to use interacting pathways and
networks, the elucidation of which requires an extensive description
of the molecular components available to the myeloid cell.

An extensive body of information is accumulating with respect 
to gene expression profiles of mammalian cells. However, much of 
the information available in public databases has been accumulated 
by the use of techniques such as single oligonucleotide chips or 
cDNA arrays that measure fewer than 6000 of potentially 30 000 to 
120 000 transcripts. The more limited range of analyses reported by
the serial analysis of gene expression (SAGE) technique accurately
estimates changes in levels of the more abundant mRNAs but
requires extensive redundant analyses to measure changes in the
patterns of expression of scarce mRNAs. We have used a modified
polymerase chain reaction (PCR)-based cDNA differential display
(DD) method in which single restriction fragments derived from
the 3' end of cDNAs are separated on a sequencing gel. Bands
from the gel can be identified initially by sequencing, but then
comparison of patterns from different samples can be made without
further sequencing. This sensitive and reproducible method detects, 
in principle, most cDNAs regardless of whether they are repre- 
sented in existing databases. 

Systematic analysis of the function of genes can also be 
performed at the protein level. This approach has the advantage of 
being closest to function, because proteins perform most of the
reactions necessary for the cell. The most common method of 
proteome analysis is the combination of 2-dimensional gel electro- 
phoresis (2DE) to separate and visualize protein and mass spectrom- 
etry (MS) for protein identification. Several such analyses of 
yeast and of normal or malignant mammalian cells have been 
performed. To date, however, there have been few studies in which 
both mRNA and protein have been compared by applying analyses 
to the same samples. The studies of Anderson and Gygi showed
that there is not a good correlation between mRNA and protein
levels in yeast or human liver cells. However, other analyses
disagree with this conclusion (Greenbaum et al, manuscript
submitted, and Futcher et al). Furthermore, global correlations
between changes in mRNA and protein levels have not been 
examined during the execution of any developmental program. 

The MPRO cell line was derived by transduction of a dominant-negative
retinoic acid receptor construct into normal mouse bone
marrow cells. It is a granulocyte-macrophage colony-stimulating
factor (GM-CSF)-dependent line arrested at a promyelocytic stage
of development.15,16 After treatment with all-trans retinoic acid
(ATRA) most of the cells acquire the morphology of mature
neutrophils and begin to produce neutrophil lactoferrin and gelatinase,
2 proteins characteristic of neutrophil secondary granules.
As such, it offers a valuable model for studying neutrophil
differentiation in vitro. 

We now report the analysis of mRNA expression changes
during the process of MPRO cell maturation to neutrophils and
compare the results with a limited analysis of cellular protein
composition. mRNA expression changes were studied by combining
the use of oligonucleotide arrays and DD. A database (dbMC)
with comprehensive genomic information for the myeloid differentiation
program was constructed (accessible at
http://www.bioinfo.mbb.yale.edu/expression/neutrophil). We have grouped the changes in
implications for the genetic program of myeloid differentiation. 

We also compared 2-dimensional high-resolution gel electrophoretograms
from control cells and cells differentiated for 72
hours in the presence of ATRA. Fifty protein spots whose relative
intensity changed prominently during differentiation were examined
by mass spectrometry. The results suggest a poor correlation
between mRNA expression and protein abundance, indicating
that it may be difficult to extrapolate directly from individual
mRNA changes to corresponding ones in protein levels (as 
estimated from 2DE). 



Materials and methods 

Cell lines 

MPRO cells and HM-5 cells provided by Dr Schickwann Tsai (Fred
Hutchinson Cancer Research Center, Seattle, WA) were used throughout
the study. The cells proliferated continuously as a GM-CSF-dependent cell
line at 37°C in Iscove's modified Dulbecco medium (Gibco BRL, Grand
Island, NY) supplemented with 5% to 10% fetal calf serum (Gibco BRL)
and 10% HM-5-conditioned medium as a source of GM-CSF. Morphologic
differentiation of the blocked MPRO promyelocytes was induced by
treatment with 10 µM ATRA (Sigma, St Louis, MO). Controls were
cultured in the absence of ATRA but with the same volume of ve- 
hicle (ethanol). 

RNA isolation and differential display

After exposure to 10 µM ATRA for 0, 24, 48, or 72 hours, total cellular
RNA was isolated from MPRO cells using TRIzol reagent (Life Technologies,
Gaithersburg, MD). cDNA was then synthesized using a T-7 Sal-Oligo
d(T) 32 primer as described previously. The double-stranded cDNA was
digested with 1 of 9 different restriction enzymes (ApaI, BglII, BamHI,
EagI, EcoRI, HindIII, XbaI, KpnI, and SphI) and ligated to Y-shaped
adaptors with a complementary overhang. DNA fragments were then
amplified by PCR as described previously. PCR products were separated
on a sequencing gel of 6% polyacrylamide with 7 M urea. The gel was dried
and exposed to x-ray film. Genes from differential display gels, whose
maximum intensity changes equaled 2+ on a scale of 1+ to 8+, were
recorded as significantly changed. Individual DNA bands were recovered
from the gels, amplified by PCR, and sequenced.

Oligonucleotide chip analysis of RNA samples

Ten micrograms total RNA from each sample (0, 24, 48, or 72 hours) was
used to prepare cDNA. This cDNA was transcribed with T7 RNA
polymerase to prepare a fluorescently labeled probe. Each sample was
hybridized to a mouse array chip (Mu11K Array; Affymetrix, Santa Clara,
CA) containing oligonucleotide probe sets corresponding to approximately
7000 known genes or ESTs represented by UniGene clusters. cDNAs
were considered present if their probe set results were rated as such by the
GeneChip software (Affymetrix) and if the average difference (AD)
between perfect match and mismatch probe pairs was not less than 100 U. If a
gene was represented by more than one array probe set, the average of all
probe sets for the gene was taken. Genes with AD values between 100 and
200 were considered unchanged because of their low expression levels.
Those genes with AD values equal to or more than 200 U at one time point
were further studied by rescaling, threshold, and normalization methods
described in the MIT Center for Genome Research Web site. A value of 20
was assigned to any gene with an AD below 20 at some time point.
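The AD thresholds in this paragraph can be illustrated with a minimal sketch. It is not the GeneChip software or the MIT rescaling procedure: the function names and the three labels are hypothetical, the software's own present/absent call is not modeled, and the classification here uses only the peak AD of a gene's time course.

```python
from statistics import mean

PRESENT_MIN_AD = 100  # below this peak AD the gene is not treated as present
CHANGE_MIN_AD = 200   # genes never reaching this AD are treated as unchanged
FLOOR_AD = 20         # AD values below 20 are raised to 20


def summarize_gene(probe_sets):
    """Average the AD values of all probe sets of one gene, per time point.

    probe_sets is a list of time courses, one per probe set, each holding
    the AD at 0, 24, 48, and 72 hours.
    """
    return [mean(values) for values in zip(*probe_sets)]


def classify_gene(ad_by_time):
    """Return a (label, floored_values) pair for one gene's time course.

    Labels are illustrative only: "absent" (peak AD < 100), "unchanged"
    (peak AD between 100 and 200), or "candidate" (AD >= 200 at some
    time point, so the gene would be studied further).
    """
    peak = max(ad_by_time)
    if peak < PRESENT_MIN_AD:
        label = "absent"
    elif peak < CHANGE_MIN_AD:
        label = "unchanged"
    else:
        label = "candidate"
    floored = [max(FLOOR_AD, value) for value in ad_by_time]
    return label, floored
```

Flooring at 20 keeps later ratio calculations from dividing by zero or by values too small to be measured reliably.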

Bioinformatics and database development

All the sequences or gene fragments were searched using Blast against
GenBank and TIGR gene indices. A database of genes or ESTs whose
expression levels changed during myeloid differentiation was constructed
containing information for each band or gene. This included GenBank
matches, Locus Link or UniGene clusters, expression patterns, tissue
distribution, synonym(s), protein name, gene name(s), notations of possible
functions, poly A signal and sequence quality, and hyperlinks to the
database searches, sequence trace files, and related references. All gene data
were then gathered into a cluster file. Supplementary information is
available at http://bioinfo.mbb.yale.edu/expression/neutrophil.

Classification and analysis of DNA fragments 

Sequences from differential display analyses were classified as representing
known genes, ESTs, genomic sequences, or novel genes as described.
Known genes from both differential display and arrays were clustered into
27 functional categories and searched against SWISS-PROT
(http://www.expasy.cbr.nrc.ca/cgi-bin/sprot-search-ful) or PIR
(http://www.pir.georgetown.edu/). Information such as function, subcellular location,
family and superfamily classification, map position, similarity, synonym(s),
protein name, gene name(s), and so on was recorded in a variety
of databases.



Northern blot analysis 

Thirty micrograms total cellular RNA per lane from time-course MPRO
cells were loaded onto 1.2% formaldehyde-agarose gels, then transferred to
Hybond-N+ membranes (Amersham Pharmacia Biotech, Uppsala, Sweden).
After standard prehybridization, membranes were hybridized overnight
at 65°C with radiolabeled cDNA probes (ordered from Research
Genetics according to their dbEST Image ID). Membranes were washed at a
final stringency of 60°C in 0.1× SSC.

Immobilized pH gradient 2-dimensional gel electrophoresis
and mass spectrometry 

Induced MPRO cells collected at 0 and 72 hours were lysed with lysis
buffer (540 mg urea, 20 mg dithiothreitol, 20 µL Pharmalyte [3-10], 1.4 mg
phenylmethylsulfonyl fluoride, 1 µg each aprotinin, leupeptin, pepstatin A,
and antipain, 50 µg TLCK, and 100 µg TPCK/1 mL). We applied 100 µL
each MPRO cell lysate (2.5 × 10* cells/100 µL) to immobilized pH
gradient (IPG) strips (pH 3-10 L; Amersham Pharmacia Biotech), and IPG
electrophoresis was conducted for 16 hours (20 100 Vh) using an Immobiline
Drystrip Kit (Amersham Pharmacia Biotech). Electrophoresis in the
second dimension was carried out in a 12% sodium dodecyl sulfate-polyacrylamide
gel electrophoresis (SDS-PAGE) gel with the Laemmli-SDS
continuous system in a Protean II xi 2-D cell (Bio-Rad) run at 40 mA
constant current for 4.5 hours. Proteins were detected by Brilliant Blue
G-colloidal staining. Protein spots were excised from the gel and digested
with trypsin. ACTH clip (average [M+H] 2466.70) and bradykinin
(average [M+H] 1061.23) were used for calibration of peptide masses. One
microliter sample digest was mixed with 1.0 µL α-cyano-4-hydroxycinnamic
acid (4.5 mg/mL in 50% CH3CN, 0.05% TFA) matrix solution and
1 µL calibrants (100 fmol) each. The spectra of the peptides were acquired
in reflector/delayed extraction mode on a Voyager-DE STR mass spectrometer
(Perseptive Biosystems, Foster City, CA). Peptides were identified
using the ProFound search engine.
Results

Differentiation of MPRO cells

Figure 1 illustrates the morphologic changes in an MPRO cell
population representative of those used for RNA expression
analysis. Undifferentiated MPRO cells resembled promyelocytes
under the light microscope (Figure 1A). After induction with ATRA
for 24 hours, the cells morphologically differentiated into metamyelocytes
(Figure 1B). At 48 hours, the cells further developed
into metamyelocytes and band neutrophils (Figure 1C). At 72
hours, nearly 100% of MPRO cells became mature neutrophils
(Figure 1D).

Identification of mRNAs by differential display assay

MPRO cellular mRNA was analyzed at 0, 24, 48, and 72 hours after
ATRA treatment. Nine restriction enzymes were used in a 3'-end
DD approach. During MPRO differentiation, 1109 fragments
corresponding to 837 transcripts were found to change substantially
in expression levels (Figure 2). These represented approximately
279 known genes, 112 ESTs, and 59 putative new genes,
each with a perfect or fair polyadenylation signal at an appropriate
distance from the oligo-dT priming site. The gene information
detected by DD was collected in database dbMCd.

Identification of mRNAs by oligonucleotide chip assay

We used an oligonucleotide chip containing 13 179 probe sets
corresponding to approximately 7000 murine genes to analyze
patterns of mRNA expression in the same RNA samples used for
DD. The information obtained by oligonucleotide arrays was
collected in the database dbMCa.

We clustered the genes by their similarity to idealized
expression patterns. For instance, the expression pattern of an
ideal gene that is overexpressed (high) at time 0 and underexpressed
(low) at 24, 48, and 72 hours would be high-low-low-low
(HLLL). Overall we have $2^4 - 2 = 14$ idealized patterns, excluding
HHHH and LLLL. Pearson correlation was used as the
Figure 1. Morphology of MPRO cells during differentiation. MPRO cells were
induced as described in "Materials and methods," concentrated by cytospin, and
Wright-Giemsa stained. (A) Uninduced MPRO cells. (B) MPRO cells induced with
ATRA for 24 hours. (C) MPRO cells induced with ATRA for 48 hours. (D) MPRO cells
induced with ATRA for 72 hours.



Figure 2. Distribution of genes obtained by DD assay. MPRO cell mRNA was
analyzed at 0, 24, 48, and 72 hours after ATRA treatment; 1109 fragments
corresponding to 837 transcripts were found to change substantially in expression
levels. The total 837 transcripts were classified into 6 categories according to the
bioinformatic analysis. Percentages show the gene distributions in these 6 categories.
Information for each transcript was collected in database dbMCd.



measure of similarity of each gene expression pattern,
$x = (x_1, x_2, x_3, x_4)$, to each of the 14 idealized patterns
$y = (y_1, y_2, y_3, y_4)$. The 4 entries of x and y corresponded to the
4-dimensional gene expression levels at 0, 24, 48, and 72 hours,
respectively. Each gene was assigned to a cluster labeled by the
idealized pattern that had the maximal correlation with that
gene. We selected only genes that hybridized well compared
with the background (considered "present" by GeneChip software)
and had maximal AD amplitude greater than 200 U in at
least 1 of the 4 stages. We further tabulated the 14 patterns
according to whether the gene expression changed at early
(0-hour), intermediate (24- and 48-hour), and late (72-hour)
time points and whether gene expression monotonically increased
(up-regulated), monotonically decreased (down-regulated),
or was not monotonic (transient). Table 1 shows 8
clusters of 104 genes that had significant changes of mRNA
levels, arranged according to the temporal stage and the
monotonic/transient changes of expression levels.
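The assignment of a gene to its best-correlated idealized pattern can be sketched as follows. This is a minimal illustration assuming NumPy; the function names are hypothetical, high and low are encoded as 1 and 0 (any two distinct levels give the same Pearson correlation), and the rescaling applied by the authors before clustering is not reproduced.

```python
from itertools import product

import numpy as np


def idealized_patterns(n_points=4):
    """Build all high/low patterns except the two constant ones.

    For 4 time points this yields 2**4 - 2 = 14 patterns keyed by labels
    such as "HLLL", with H encoded as 1.0 and L as 0.0.
    """
    patterns = {}
    for bits in product((0, 1), repeat=n_points):
        if len(set(bits)) == 1:
            continue  # skip HHHH and LLLL: a constant has no correlation
        label = "".join("H" if bit else "L" for bit in bits)
        patterns[label] = np.array(bits, dtype=float)
    return patterns


def assign_pattern(profile, patterns):
    """Return (label, r) for the pattern with maximal Pearson correlation."""
    x = np.asarray(profile, dtype=float)
    best_label, best_r = None, -np.inf
    for label, y in patterns.items():
        r = np.corrcoef(x, y)[0, 1]
        if r > best_r:
            best_label, best_r = label, r
    return best_label, best_r
```

A profile that is constant across all 4 time points has an undefined correlation and is returned with a label of None, which is one reason to filter out unchanged genes before clustering.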

Principal component analysis determined whether we could
comprehensively present multidimensional data (4-dimensional in
our case) in a simple 2-dimensional graph. First, we found the 4
principal components, which were the axes of the most compact
4-dimensional ellipsoid that encompassed the 4-dimensional cloud
of data. Each axis was a different linear combination of the original
4 variables. Then we verified that the first 2 principal components
(the first 2 largest axes of the ellipsoid) captured most (95.2%) of
the variation of the data. Therefore, the data could be faithfully
projected (with a minor loss of information) into a 2-dimensional
graph, with the 2 largest principal components as the x- and y-axes.
As shown in Figure 3, genes tend to coalesce in clusters, according
to their labels determined by their similarity to an ideal expression
pattern. In summary, a genomic (global) picture of the distribution
of genes according to their similarity to predetermined idealized
multidimensional expression patterns is concisely displayed in a
2-dimensional graph.
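The projection onto the first 2 principal components can be sketched with a singular value decomposition. This is an illustrative sketch assuming NumPy, not the authors' software; the function name is hypothetical, and standardizing each time point (column) to zero mean and unit variance is an assumption about how the standardization was done.

```python
import numpy as np


def pca_project(profiles, n_components=2):
    """Project gene profiles onto their leading principal components.

    profiles is an (n_genes, 4) array of expression levels at the 4 time
    points. Returns the projected scores, the principal axes (one row of
    coefficients per component), and the fraction of variance explained
    by every component.
    """
    x = np.asarray(profiles, dtype=float)
    # Standardize each time point to zero mean and unit variance.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Rows of vt are the principal axes; squared singular values are
    # proportional to the variance captured along each axis.
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = z @ vt[:n_components].T
    return scores, vt[:n_components], explained
```

Summing the first 2 entries of the returned variance fractions gives the share of variation retained in the 2-dimensional graph, which is the quantity the text reports as a percentage.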
Table 1. Genes differently regulated during the different stages of the mouse promyelocytic cell line differentiation process

Early, up-regulation: LHHH (n = 10)
    MGd^Zrx) Itgb2 Il1r2 Lcn2 ttpeS CebpbH2-DEtohl6Zyx

Early, down-regulation: HLLL (n = 11)
    Tcrg>V4 Ly64 Osg Spl2-1 McptS Myc Myb Tlr4 Npml Erh Hsp60

Middle, up-regulation: LLHH (n = 6)
    PiratCytb Pfc PiraS Cd53 tfnor?

Middle, down-regulation: HHLL (n = 1)
    Mpo

Middle, transient: LLHL (n = 9)
    Sell Klf2 PIrae Ptrb Lsti Ltf Seina4d State Mmp9

Middle, transient: LHHL (n = 17)
    CebpB Lyzs Fcgr3 Arf5 Lampi Stat3 C8f2ra Oai Actg Sfpil Gpx3 Ptprc PrtnS Irf1 RpsSkal Ltb4rMytn

Late, up-regulation: LLLH (n = 13)
    nie CsftrW Cf5/S100a8 L-CCR Ctss Aldol Rftc2 Fpri Ctsd Ubb Ptmb4

Late, down-regulation: HHHL (n = 37)
    Actx lf(2£l2 Rpllfl Actb Ly6e Atfim\2 Psma2 Onas Zfp36 tt4m Lrf>rShfdg1
    Max Rps8 Csf2rt)l SIpl Tetexl Tpl Btf3 Cn/r Gys3 SIcI Oai Ctsb Seppi Rtn3
    Ccnb2 S100a9 CT1 1 Hl5t5-2ax Rola Copa Ostml Gnb2-rs1 Om RPLB

Affymetrix Mu11K arrays containing 13 103 probe sets corresponding to 12 002 GenBank accessions were used for hybridization. Arrays were hybridized with
streptavidin-phycoerythrin (Molecular Probes) biotin-labeled RNA and scanned. Intensity for each feature of the array was captured using GeneChip software (Affymetrix), and
a single raw expression level for each gene was derived from the 20 probe pairs representing each gene using a trimmed mean algorithm. For each gene, the AD of the 24-, 48-, and
72-hour samples was calibrated by dividing by the slope of the linear regression line for a graph with the x-axis the AD of 0-hour probe sets and the y-axis the AD of the respective
time point (24, 48, or 72 hours). A threshold of 20 U was assigned to any gene with a calculated expression level below 20 because discrimination of expression below this level
could not be performed with confidence. Each gene expression profile was categorized as described in Tables 3, 4, and 5. For the 4 time points, the minimum AD of the
relatively higher group (MIN-H) was divided by the maximum AD of the relatively low group (MAX-L), and those genes whose MIN-H/MAX-L was greater than 2 were selected as
meaningfully regulated. Genes were sorted in descending order based on the MIN-H/MAX-L. Genes in boldface are those whose expression level was in the top 20% (ie,
maximum AD of 4 time points greater than 3000), and genes in italics are those in the bottom 20% (ie, maximum AD of 4 time points less than 300). The differentiation period
was grouped into 3 stages: early (0-hour), middle (24-hour and 48-hour), and late (72-hour) stages.

AD indicates average difference; gene symbols are expanded in an Appendix at the end of this article.
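The MIN-H/MAX-L selection rule in this legend can be illustrated with a short sketch. The function names are hypothetical, the regression-slope calibration step is not reproduced, and raw AD values taken from the other tables are not the calibrated values, so results computed this way need not match the membership of Table 1.

```python
FLOOR_AD = 20  # threshold assigned to expression levels below 20 U


def min_h_over_max_l(ad_by_time, pattern):
    """Divide the lowest AD among H time points by the highest among L.

    ad_by_time holds the AD at 0, 24, 48, and 72 hours; pattern is a
    label such as "LHHH" marking each time point as high or low.
    """
    floored = [max(FLOOR_AD, value) for value in ad_by_time]
    high = [v for v, p in zip(floored, pattern) if p == "H"]
    low = [v for v, p in zip(floored, pattern) if p == "L"]
    if not high or not low:
        raise ValueError("pattern must contain both H and L time points")
    return min(high) / max(low)


def is_meaningfully_regulated(ad_by_time, pattern, cutoff=2.0):
    """Apply the MIN-H/MAX-L > 2 criterion to one gene."""
    return min_h_over_max_l(ad_by_time, pattern) > cutoff
```

Using the minimum of the high group against the maximum of the low group is a conservative choice: a gene passes only if every high time point exceeds every low time point by the cutoff.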




Figure 3. Gene clusters in the first 2 principal component spaces. Principal
component analysis allowed us to present the multidimensional data (in this case,
4-dimensional data of each gene expression pattern) in a simple 2-dimensional
graph. We derived the 4 principal components, which are a linear combination of the
standardized expression intensities (zero mean and unit variance) at 0, 24, 48, and
72 hours. The first 2 principal components captured most of the variation of the data
(approximately 86%). Therefore, the data can be displayed (with a minor loss of
information) in a 2-dimensional graph. The first and second principal components, c1 and
c2, are given by the linear combinations c1 = 0.747 · n1 - 0.11 · n2 - 0.666 · n3 + 0 · n4
and c2 = 0.278 · n1 + 0.353 · n2 + 0.233 · n3 - 0^ · n4, where n1, n2, n3,
and n4 are the rescaled and standardized expression levels at 0, 24, 48, and 72
hours, respectively. The axes legends c1 and c2 stand for the first 2 principal
components. In this paper we used the Pearson correlation to measure the similarity
of each gene with the idealized expression patterns, as opposed to the Euclidean
distance we used in a previous work, because clusters were better separated using
this measure. In both cases, we presented the data in the 2-dimensional space of the
lowest principal components. The data had a tendency to be circularly distributed
when we used the Pearson correlation as a distance measure.



Correlation between array and DD analyses 

We have previously demonstrated a correlation coefficient of 0.93
between visual estimates of changes in band intensity on DD and
Phosphorimager System (Molecular Dynamics, Sunnyvale, CA)
estimates of band intensity and a correlation coefficient of 0.88
between hybridization intensity changes of mRNA on Northern
blot analyses and changes in band intensity on DD. In a few cases
there were clear discrepancies in the pattern of expression of a
gene, as estimated by DD and by oligonucleotide chip analysis. We
chose the 6 most extreme cases and examined the levels of mRNA
change for these genes by Northern blot analysis (Figure 4). In 5
cases, the Northern blot results agreed with the results of the DD
analysis, whereas the results of Gnb2-rs1 disagreed with the
oligonucleotide array but duplicate bands from DD showed a
relatively high level of expression in the 0 time sample that did not
correlate with the Northern blot (Table 2). One possible explanation
for these findings was the change in the relative use of different
polyadenylation sites after the addition of ATRA to the MPRO cells.

Constructing a database for mRNA level changes during
myeloid differentiation

Based on the data obtained above, an in-house database (dbMC)
was constructed that included 2 subdatabases, dbMCd and dbMCa,
for collecting gene information from DD or oligonucleotide arrays,
respectively. Each entry in dbMC is accompanied by a so-called
executive summary. The linkage between dbMCd and dbMCa was
established by UniGene ID and cluster ID. dbMC contains the
temporal expression patterns of genes during the MPRO cell
differentiation process, including not only products represented in
public databases but also novel transcripts.
Figure 4. Northern blot analysis of selected mRNAs. Equivalent amounts of RNA
from MPRO cells induced by ATRA at different time points (0 hour, 24 hours, 48 hours,
and 72 hours) were resolved by formaldehyde-agarose gel electrophoresis, stained
to verify the amount of loading. Eleven genes were separately probed on the RNA
filters. The gene symbol of each probe was listed at the left of a related Northern blot
result. Detailed information on these 11 probes was listed in Table 5. One of the
RNA-blotted membrane photographs is shown with methylene blue-stained 28S and
18S RNA subunits demonstrating the quality and quantity of RNA loaded in
individual lanes.



Analysis of gene expression patterns during MPRO
differentiation

Many of the genes identified in this study were found in myeloid
cells or were implicated in myeloid development for the first time.
We detected 8 cytokines and chemokines whose mRNA levels
changed more than 5-fold by arrays and 2-fold by DD during the
maturation of MPRO cells (see our Web site,
http://bioinfo.mbb.yale.edu/expression/neutrophil). Among these were 2 members of
the CC chemokine family. Interleukin-1α (IL-1α) was up-regulated
at the late stage of differentiation (LLLH pattern, Table 1).

Table 2. Expression patterns of genes detected by Northern blot analysis

mRNA for approximately 52 receptors was detected by one or
the other method. A number of the receptors known to be present on
mature neutrophils showed late induction of mRNA, and their
levels of induction were high, indicating that the expression of
these products is a prominent event late in neutrophil maturation
(Table 3). Rarely was mRNA for receptors down-regulated,
consistent with myeloid maturation being accompanied by increasing
responsiveness of the cell to a variety of external stimuli.

Expression of mRNA for granule proteins 

Neutrophils contain several types of granules that develop at
different stages of myeloid maturation. Levels of mRNAs
encoding secondary granule proteins, such as lactoferrin, increased
as the cells matured (Table 4). The level of mRNA for Mmp9,
reported as a tertiary granule protein, increased markedly between
24 and 48 hours after the induction of differentiation, whereas
mRNAs for secondary granule proteins either increased less
markedly or showed a maximum increase by 24 hours. mRNAs for
several primary granule constituents, such as myeloperoxidase and
cathepsin G, were present in unstimulated cells and decreased as
the cells matured. There was a discrepancy in the measurements of
proteoglycan mRNA by DD and oligonucleotide chips, but Northern
blots showed that it reached a peak at 48 hours and then
declined (Figure 4). Cathepsin D is reported as a primary granule
protein, but its pattern of mRNA expression more closely resembled
that of secondary granule constituents. In addition to
known granule components, mRNAs for several other cathepsins
were up-regulated during myeloid differentiation, in parallel with
or later than the tertiary granule protein mRNAs.

mRNAs for transcription factors 

Transcription factor genes, including several identified at the sites
of consistent chromosome rearrangements in acute myeloid leukemia,
have been implicated in normal myeloid differentiation and in
the expression of neutrophil proteins. However, comprehensive
information concerning the expression of these transcription factors
during myeloid development is not readily available. Therefore,
we compared gene names and identifiers in our databases to
those of the transcription factor database Transfac (http://


Columns: gene symbol; gene accession; AD value by array at 0, 24, 48, and 72 h; intensity by DD at 0, 24, 48, and 72 h


Cebpa 


m23B2 


33 


212 


182 


44 










Cebpb 


X62600 


390 


1248 


1380 


1903 










Cebpd 


xeisoo 


1S7 


262 


168 


430 










Cebpe 




















Myb


M1284e 


892 


356 


230 


435 










SIpl 


U73004 


617 


601 


783 


402 




2 


3 


3 


Prga 


W45a34 


153 


259 


339 


345 




1 


1 


2 


Gnb2-rs1 


X75313 


4231 


3623 


3215 


3403 




4 


1 


1 


Ly6e 


U042$8 


3061 


5391 


2844 


1282 




2 


1 


1 


Lspl 


M90316 


65 


376 


640 


28 




3 


5 


6 


Actb 


X03765 


309S 


3588 


3976 


2434 




2 


3 


2 



Gene symbol and gene accession refer to National Center for Biotechnology Information databases and, in particular, to Locus Link. AD value is the average difference in
the value of hybridization intensity between the set of perfectly matched oligonucleotides and the set of mismatched oligonucleotides in the oligonucleotide array. Band
intensities from DD were semiquantified on a scale from 1 (+) to 8 (++++++++). These estimates are shown as boldface numbers in this table. Both AD value and
intensity of genes were studied at 4 time points corresponding to MPRO cells induced for the indicated times.

DD indicates differential display; MPRO, mouse promyelocytic cell line; for gene symbols, see the Appendix at the end of this article.
Table 3. Receptors expressed during myeloid differentiation process

Maximal fold change groups: less than 2; 2 or more, less than 3; 3 or more, less than 4; 4 or more, less than 5; 5 or more

Columns: gene symbol; gene accession; AD value by array at 0, 24, 48, and 72 h


Bzrp 


D21207 


641 


668 


881 


887 


Cnikar4 


X99581 


508 




378 


684 


Crry 


M34173 


•MO 


384 


506 


606 


Csf2rt1 


M34397 


aiO 


345 


410 


241 


HtrSa 


218278 


188 


272 


273 


339 


M6pr 


VQ^fJftQ 
AWUOO 


536 


409 


408 


649 


MPPIR 


AA1 1B78a 


232 


84 


63 


381 


TCRGB 






212 


244 


299 


Tnfrsfia 


M69377 


0 


1 


1 


1 


Cmkbrl 


U28404 


221 


244 


604 


638 


Crhr 


Af £OUO 


121 


200 


250 


355 


Csf2ra 


tVIOOUf B 


171 


372 


402 


264 


Ebia 




187 


270 


428 ' 


148 


Gridi 


D10171 


128 


164 


150 


257 


Ifngr 


J06266 


141 


263 


327 


261 


II2f9 


U21795 


205 


184 


231 ' 


477 
3968 


Ldlr 


X64414 


1399 


1653 


1665 


P40-8 


J02870 


849 


677 


381 


640 


Plaur 


X62701 


312 


443 


476 


734 


Rarg 


M34476 


102 


113 


114 


218 


Srb1 


U37799 


126 


232 


132 


258 


Cr2 




83 


138 


243 


77 


CsfZrbZ 


M2966S 


209 


249 


437 


111 




J05020 


2398 


2766 


3365 


8761 


Pcor2b 


X0464d 


1703 


1662 


1431 


4605 


Ifngr2 


U 69699 


1 


2 


2 


3 


Nr4a1 


VlROOR 
/\ iDtnfu 


96 


186 


202 


401 


I11r2 




482 


1796 


2872 


3818 


C6r1 


LOS^O 


186 


434 


808 


1078 


Drd2 


X65e74 


0 


0 


0 


219 


Fcgr3 


M14216 


1 


1 


1 


2 


Fpr1 


122181 


0 


69 


141 


671 


GCR 


AA240711 


2 


0 


0 


0 


L-CCR 


AA034646 


48 


176 


314 


2066 


NMDARGB 


AAB20211 


2 


2 


0 


0 


P2rx1 


Xe4696 


79 


346 


530 


744 


Ptral 


U96682 


0 


43 


172 


378 
1874 


PIraS 


U96e86 


,274 


391 


954 


PiraB 


□96687 


122 


635 


2014 


1716 


Plrb 


U9d689 


191 


445 


966 


747 


Sell 


M26324 


48 


104 


570 


20 


TcrEhV4 


M54996 


16S0 


78 


65 


315 



Entries are grouped by the maximal fold change of AD values. Abbreviations of gene names are taken from gene symbols listed in the Locus Link of the National Center for
Biotechnology Information database where available. Numbers in bold denote those gene expression patterns obtained by differential display rather than by
oligonucleotide array assays. The other information is presented as in the legend to Table 2.

AD indicates average difference; gene symbols are expanded in an Appendix at the end of this article.



www.transfac.gbf-braunschweig.de/TRANSFAC) and determined
which factors contained in this database were present at detectable
levels in MPRO cell mRNA, using Affymetrix software for the
criteria for inclusion of mRNAs from approximately 200 murine
transcription factor probe sets on the oligonucleotide chip. Of
these, 54 were expressed and 13 showed changes of 3-fold or more
in chip signal (Table 5).
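The fold-change grouping used in Tables 3 and 5 can be illustrated with a brief sketch. The source does not spell out how the maximal fold change was computed, so this assumes it is the ratio of the largest to the smallest AD across the 4 time points after raising values below 20 U to 20; the function names are hypothetical.

```python
FLOOR_AD = 20  # assumed floor, taken from the 20 U threshold in the methods


def maximal_fold_change(ad_by_time):
    """Ratio of the largest to the smallest floored AD in a time course."""
    floored = [max(FLOOR_AD, value) for value in ad_by_time]
    return max(floored) / min(floored)


def fold_change_group(ad_by_time):
    """Name the table group that a gene's maximal fold change falls in."""
    fold = maximal_fold_change(ad_by_time)
    if fold < 2:
        return "less than 2"
    for lower in (2, 3, 4):
        if fold < lower + 1:
            return f"{lower} or more, less than {lower + 1}"
    return "5 or more"
```

Under these assumptions, the Cebpb time course listed in Table 5 (390, 1248, 1380, 1903) gives a ratio of about 4.9, and a gene with an AD of 0 at one time point is compared against the floor of 20 rather than producing a division by zero.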

The changes in certain transcription factors, such as the moderate
down-regulation of myb and myc and the up-regulation of the Max
dimerization protein MAD, were consistent with the shift of the cells
from a proliferative to a differentiated state. Some changes are more
difficult to explain, such as the up-regulation of DP1, a partner for E2f
factors in the regulation of S-phase genes, and the mild up-regulation of
the Id genes, commonly associated with an inhibition of differentiation
by competition with bHLH transcriptional activators.

The C/EBP family has been extensively studied with respect to
myeloid differentiation. Absolute levels of the C/EBP α and δ
mRNAs were low, probably at the borderline of significance for the
oligonucleotide chip assay, whereas the level of C/EBP β appeared
higher. In addition, there were discrepancies between the chip
Table 4. Granule constituents expressed during mouse promyelocytic cell line cell differentiation

Columns: gene symbol; gene accession; AD value by array at 0, 24, 48, and 72 h


Azurophil (primary) granules 
















Man2c1 


AA1618B0 


178 


134 


99 


164 




Ctsb 


M65270 


442 


480 


695 


389 




Ctsd 


X52B6e 


214 


1087 


1628 


2784 




Ctsg 


M9e801 


1509 


405 


46 


266 




E12 


U04g62 


668 


1273 


843 


157 




E1a2 


AAB89Qie 


47 


159 


134 


163 




GUS4 


Me3836 


544 


226 


266 


264 




Lyzs 


M21050 


0 


1 


1 


3 




McptS 


X78545 


831 


268 


66 


491 




Mpo 


X15378 


3788 


3000 


776 


692 




Pro 


Xiei33 


2621 


2653 


2920 


9859 


Possible granule proteins












Ctsc 


AA1 44887 


252 


194 


342 


676 




Ctse 


Xg7399 


1 


3 


4 


5 




Ctsh 


U06119 


46 


124 


195 


166 




Ctsi 


X06086 


16 


11 


31 


237 




Ctss 


AA089333 


12 


9 


88 


463 


Specific (secondary) granules














Cpa3 


J0511B 


621 


270 


90 


801 




Cd36l2 


ABOOdSSS 


113 


93 


157 


10/ 




Cnip 


X94363 


80 


479 


704 


626 




Cybb 


U43384 


8 


24 


91 


128 




Ear2 




0 


1 


1 


2 




Fpri 


L22161 


178 


220 


235 


846 




ltgb2 


X149S1 


0 


2 


4 


2 




Lcn2 


W13166 


916 


3513 


3931 


6036 




Ltf 


J0329d 


19 


162 


333 


138 




MBP 


W46834 


5 


1 


1 


2 




Mnnp13 


X66473 


44 


43 


72 


178 


Tertiary granules 


Ngp 


L37297 


2661 


4782 


2311 


6912 












Mmp9 


227231 


0 


1 


2 


2 



Shown are the possible granule protein cDNAs represented on the oligonucleotide arrays, sorted by their expression patterns as follows: first by the average difference AD
value, then by the granule types, and last by the alphabetical order of gene symbols. Data are presented as described in the legend to Table 3.
AD indicates average difference; gene symbols are expanded in an Appendix at the end of this article.



estimates and the mRNA levels observed by Northern blotting with
specific probes for these genes. In particular, the latter method,
more sensitive and specific, showed that C/EBP α began to decline in
the most mature cells, whereas C/EBP δ mRNA declined progressively
beginning at 24 hours after the onset of differentiation.

C/EBP ε is a more recently cloned C/EBP family member. Previous
studies indicated it is expressed in a large array of human leukemia cell
lines blocked at various stages of differentiation and that it is up-regulated
during granulocytic differentiation. A C/EBP ε probe was
not included in the oligonucleotide chips, and this mRNA was not
detected by DD. Therefore, we examined the C/EBP ε expression
patterns by quantitative PCR and Northern blot analysis (Figure 4).
C/EBP ε exon 1 was PCR amplified from MPRO RNAs using primers
RY48 (AGCCCCCGACACCCrrGATGA) and RY49
(CTCGCACACTGCGGGCAGACAG). The results showed that C/EBP ε is expressed
throughout myeloid differentiation, with expression levels increased
moderately in later stages.

We detected a number of other transcription factors that are
broadly expressed or that have been reported in other studies of
hematopoiesis (Table 5). Some of the factors that were most
strongly induced during differentiation have been studied in other
contexts but not previously implicated in hematopoiesis, such as a
mammalian homologue to the Drosophila enhancer of split gene, a
transcriptional silencer. The mammalian gene is expressed at
relatively high levels as measured by the oligonucleotide chip and
is a candidate for mediation of the silencing of growth-related
genes in the maturing neutrophil. Another candidate transcriptional
silencer, Tif1b, may serve as a corepressor for the KRAB domain
family of zinc finger transcription factors and also may mediate
binding of the heterochromatin protein HP1 to DNA.

There were 26 transcription factors whose mRNAs showed no
significant changes by oligonucleotide chip analysis and were not
identified as differentially regulated genes by differential display
assays. PU.1, a factor necessary for the production of neutrophils
and the expression of several neutrophil genes, showed less than a
3-fold increase in mRNA, below the threshold for a significant
change. Other candidate hematopoietic transcription factors, such
as PEBPlaB2 (AML1), GATA-1, and SP-2, were represented on
the oligonucleotide chips, but their mRNA levels were so low that
they were reported as absent in this study. The possibility that small
changes in the levels or ratios of some transcription factors could
produce marked changes in transcription potentially limits the
ability of data generated by present methods to explain transcriptional
changes during differentiation.

Protein expression patterns of MPRO cells during
ATRA induction

We visually compared the 2DE patterns from MPRO cells at the 
same time points used for mRNA analysis. In most cases the 
Table 5. Transcription modulators presented during myeloid differentiation

Columns: maximal fold change; gene symbol; gene accession; AD value by array at 0, 24, 48, and 72 h


Less than 2-fold 
















Zfp11-6 


AB02aS42 


2630 


2989 


2795 


2615 
1 




Btf3 


W13602 


3 


3 


£ 




Oata2 


AB000096 


G62 


770 


472 


730 




Hmgl 


J04179 


337 


348 


177 


232 




Idb1 


M31685 


466 


787 


721 


637 




Max 


Me3d03 


266 


224 


312 


172 




Nfatc2 


AA660093 


2313 


3218 


2396 






Pml 


U33826 


173 


281 


329 


306 




Rare 


M34476 


102 


113 


114 


2l8 




Rela 


MeidOd 


297 


260 


304 


244 




SoxIS 


W53S27 


419 


461 


484 


M7 

QOf 




Ybxl 


M62867 


643 


489 


472 


496 


2 or more, less than 3 


Zfp162 


Y12836 


671 


734 


720 














Cebpd 


X61B00 


157 


262 


168 


430 




Idb2 


M69293 


244 


210 


310 


604 




Jundl 


W29356 


1274 


2002 


1434 






Lyl1 


X576B7 


399 


342 


347 


Rai 

oy 1 




Nfe2 


L09600 


458 


743 


1042 


out> 




Nfkbl 


128117 


953 


2044 


1876 






Pbxl 


AF020196 


611 


303 


345 


212 




sfpil 


A34693 


Jro 


784 


991 


529 




Tifib 


U67303 


673 


' 659 


420 


863 




Trp53 


PI 0361 


259 


149 


125 


361 




Usf2 


U 12283 


129 


185 


285 


192 




Ybx3 


L36649 


96 


169 


210 


119 


3 or more, less than 4 


Zfp216 


AA510137 


82 


151 


204 


106 












Irfl 


M21065 


85 


207 


278 


198 




Klf2 


U25098 


62 


86 


246 


f f 




Myb 


Ml 2848 


892 


366 


230 


435 




Stat3 


AA396029 


484 


1057 


1012 


290 


4 or more, less than 5 


Tfdpl 


008639 


307 


560 


505 


1093 












Cebpb 


X62600 


390 


1248 


1380 


1903 


6 or more 


StfaU 


YD7836 


223 


383 


510 


936 




Cebpa 


M62362 


33 


212 


182 


44 




Grg 


X73369 


99 


566 


916 


1005 




Mad 


X83106 


0 


111 


167 


327 




Myc 


L00039 


314 


112 


62 


173 




Etoh{6 


W89667 


169 


366 


313 


1003 




TBX1 


AA542220 


0 


0 


1 


2 



Shown are the transcripOon factors Identified aa present by the oligonucleotide array anat/sb whoso maximal 
sets was greater than or equal to 200 U (n this study. Data are presented as described In the legend to Table 3. 
AD IndlcatBs average difference; gene symbols are expanded In an Appendix at the end of this aitlde. 
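
The row groups of Table 5 can be illustrated with a short sketch. It assumes, as a reading of the table rather than something stated in the legend shown here, that the maximal fold change is the ratio of the largest to the smallest AD value over the four time points. The floor of 20 applied to very low AD values, the top bin boundary of 5, the bin labels, and the function names are all illustrative choices, not the authors' procedure.

```python
# Sketch: bin genes by maximal fold change across a time course.
# Assumptions (not taken from the table legend):
#   - maximal fold change = max(AD) / min(AD) over the 0, 24, 48, 72 h values
#   - AD values below FLOOR are raised to FLOOR first (hypothetical choice,
#     needed only for entries such as Mad, where AD = 0 at 0 h)

FLOOR = 20.0  # hypothetical noise floor for AD values


def maximal_fold_change(ad_values):
    """Return max/min of the floored AD values."""
    floored = [max(float(v), FLOOR) for v in ad_values]
    return max(floored) / min(floored)


def fold_bin(fold):
    """Map a fold change onto five bins (labels are illustrative)."""
    if fold < 2:
        return "<2"
    if fold < 3:
        return "2 to <3"
    if fold < 4:
        return "3 to <4"
    if fold < 5:
        return "4 to <5"
    return ">=5"


# AD values (0, 24, 48, 72 h) copied from Table 5
examples = {
    "Ybx1": [643, 489, 472, 496],
    "Cebpd": [157, 262, 168, 430],
    "Myb": [892, 366, 230, 435],
    "Cebpb": [390, 1248, 1380, 1903],
    "Grg": [99, 566, 916, 1005],
}

for gene, ad in examples.items():
    fold = maximal_fold_change(ad)
    print(f"{gene}: {fold:.2f}-fold, bin {fold_bin(fold)}")
```

Under these assumptions the five example genes fall into five different bins, which is the kind of separation the row groups of the table express.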



AD between perfect match and mismatch oligonucleotide 



peptides identified for a given protein were derived from regions
along the entire length of the protein, indicating the observed
products were not the result of proteolytic degradation. These
data must be considered with several caveats: membrane and
other hydrophobic proteins and very basic proteins are not well
displayed by the standard 2DE approach, and proteins present at
low levels will be missed. In addition, to simplify MS analysis,
we used a Coomassie dye stain rather than silver to visualize
proteins, and this decreased the sensitivity of detection of minor
proteins. The MS method we used was sufficiently sensitive to
identify proteins that could barely be visualized by colloidal
blue staining. However, a limitation of the method for the mouse
is that the current database lacks predicted amino acid sequences
for a substantial fraction of murine genes. In addition, very
small proteins give only a few peptides, making statistically
confident identification difficult.



Figure 5 shows the analytical colloidal blue-stained 2DE IPG 
reference maps of differentiated MPRO cells. Expression patterns 
of more than 500 protein spots were detected and observed through 
the entire series of gels. Protein spots could easily be cross- 
matched to each other, indicating the reproducibility of the method. 
As marked on the gel pictures (Figure 5), 50 proteins with a wide 
range of molecular weights (1 to 200 kd), isoelectric points (4 to 9), 
and abundances were subjected to MS protein identification. The 
results are presented in Table 6.

Comparing the theoretical value of the molecular weight and pI
of each protein to that of the observed value, we confidently
identified 28 proteins in the expected position on the gels (spots 1 to
28). Some of the other proteins with strong matches to the murine
databases migrated to a somewhat unexpected pI position. Nine
spots gave clear peptide peaks on mass spectroscopy but did not 
match any known gene. Their identification will require amino acid 
Figure 5. 2DE electrophoretograms of MPRO cells.
MPRO cell lysate (2.6 x 10^ cells/sample) was loaded for
2DE analysis. Gels were stained with brilliant blue G-colloidal
dye. (A) 2DE map of uninduced MPRO cells (0 hour).
(B) 2DE map of matured MPRO cells (72 hours). Protein
spots marked in the maps were considered differentially
expressed and were subjected to MS analysis. The
resultant protein information is listed in Table 6.

[Gel images not reproduced. Horizontal axis: IPGs (pH gradient).]




sequence analysis or availability of more extensive murine
databases. We searched for the expression patterns of the genes cognate
to the expressed proteins in dbMC (Table 6). Nineteen genes were
found in dbMC, the mRNA for 5 genes was reported as absent, and
13 genes were present during MPRO differentiation. Comparison
of the expression patterns showed only 4 genes of 18 present on the
oligonucleotide chips whose expression was consistent at the RNA
level and protein level. None of these was on the list of the genes

Table 6. Correlation of expression patterns between mRNA level and protein level



that were differentially expressed significantly (5-fold or greater
change by array or 2-fold or greater change by DD).
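
The count of 4 agreeing genes out of 18 can be checked against the Ag column of Table 6. The sketch below is only a tally. The spot numbers and Y/N flags are transcribed from Table 6 as printed here, on the assumption that the data rows appear in spot order; spots whose genes are marked N/A are left out, and the variable names are illustrative.

```python
# Ag flags from Table 6 for the 18 spots whose genes are represented in dbMC
# (Y = mRNA and protein changes agree, N = they disagree).
# Assumption: data rows in the printed table follow spot order 1 to 28.
ag_flags = {
    1: "N", 2: "Y", 3: "Y", 4: "Y", 5: "N", 6: "N", 7: "N", 8: "N",
    10: "N", 11: "Y", 12: "N", 13: "N", 14: "N", 15: "N", 16: "N",
    17: "N", 19: "N", 25: "N",
}

agree = sum(1 for flag in ag_flags.values() if flag == "Y")
total = len(ag_flags)
print(f"{agree} of {total} spots agree ({agree / total:.0%})")
```

The four agreeing spots in this transcription are 2, 3, 4, and 11.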



Discussion 

We explored the temporal patterns of gene expression during 
myeloid development. A database has been developed to provide a 



                                                              Predicted value              2DE pattern  cDNA expression pattern
Spot  Protein definition                           GI number  kd      pI      %            0 h   72 h   0 h         72 h         Ag
1     GRP78                                        2506645    72.4    5.1     [illegible]  1     3      1321        1043.3       N
2     Actin, gamma, cytoplasmic                    6752964    41.77   5.3     40           3     6      0           2            Y
3     RhoGDI2                                      2494703    22.33   4.9     33           3     3      341         441.6        Y
4     Proliferating cell nuclear antigen           7242171    28.77   4.7     42           1     0      544         430.9        Y
5     APS kinase                                   4038346    69.6    7.1     24           2     1      43          50.7         N
6     Pyruvate kinase 3                            6755074    57.9    7.2     46           6     4      3047        5860.3       N
7     Melanoma X-actin                             6671609    41.72   6.2     39           1     3      2539        341.3        N
8     Glyceraldehyde-3-phosphate dehydrogenase     6679937    36.79   8.7     39           8     7      3073        5742.3       N
9     Stefin 3                                     461911     10.99   5.9     46           0     4      N/A         N/A
10    Guanine nucleotide binding protein, beta-2,  6660047    35.06   7.9     21           4     2      139         303.1        N
      related sequence 1
11    Triosephosphate isomerase                    6678413    26.69   6.9     26           3     3      3312        2660.1       Y
12    Testis-derived c-abl protein                 1106624    17.19   7       51           2     3      152         126.9        N
13    RNA binding motif protein 3                  7949121    16.59   6.6     25           1     0      626         812.4        N
14    Collapsin response mediator                  6681019    62.16   6.4     36           2     0      Absent      Absent       N
15    Lamin A                                      220474     47.52   6.6     38           2     0      Absent      Absent       N
16    47-kd keratin                                62763      35.62   4.8     29           3     0      Absent      Absent       N
17    aid478p                                      5931665    31.3    6.7     30           1     2      Absent      Absent       N
18    MHC class II H2-IA-beta                      3169662    28.6    7.1     39           1     2      N/A         N/A
19    Androgen-binding protein: subunit alpha      739346     8.04    6.4     68           0     2      Absent      Absent       N
20    Neuronal apoptosis inhibitory protein        5932010    158.7   6       17           1     0      N/A         N/A
21    PAD type IV                                  6756018    74.46   7.2     21           1     3      N/A         N/A
22    Human serum albumin homologue                3212625    66.45   5.7     24           0     6      N/A         N/A
23    syncrip                                      6576815    62.53   7.2     33           2     1      N/A         N/A
24    Transamidinase                               1730203    46.22   7.2     31           3     1      N/A         N/A
25    PGK crigr phosphoglycerate                   1730519    44.64   8.3     47           5     4      1086        1402.3       N
26    Proliferation-associated gene A              6754976    22.16   8.6     53           3     1      N/A         N/A
27    Putative peroxisomal antioxidant enzyme      3913065    17      7.8     65           0     3      N/A         N/A
28    IgE chain C2 region                          2137430    12.1    5.2     36           0     1      N/A         N/A

The proteins listed here are represented by the spots marked in the electrophoretograms shown in Figure 5.
Protein definition, GI number, and predicted value refer to the protein name, accession number, and properties
derived from the National Center for Biotechnology Information protein database. The column labeled % shows the
percentage of peptides predicted from the protein sequence that were detected by mass spectroscopy. The
expression level of protein spots expressed in mouse promyelocytic cell line cells induced by all-trans retinoic
acid for 0 hours and 72 hours (Figure 5) were scored on a scale of [text missing in source] the dbMC database.
The genes not represented on the oligonucleotide arrays were marked as N/A. Ag showed the correlation of gene
patterns at mRNA level or protein level. Y indicates agreement and N discrepancy between changes in cDNA and
protein spot intensity. The numbers in bold were obtained with DD. 2DE indicates 2-dimensional gel
electrophoresis; IgE, immunoglobulin E; DD, differential display.
reference for later research on the molecular mechanisms
underlying normal myeloid development.

The MPRO cell system morphologically mimics normal myeloid
differentiation and biochemically proceeds further toward mature
neutrophils than most other in vitro systems. Because the arrest in
differentiation of MPRO cells growing in the absence of ATRA is not physiologic,
there is a theoretical risk that gene expression in these cells is not
coordinated in the way that it is in normal differentiation. It is
encouraging that, for the most part, the timing of expression of genes for
proteins of the various neutrophil granules is consistent with the timing
of the morphologic and biochemical appearance of these granule
components during normal myeloid differentiation.

The DD technique provides certain advantages for detecting
and comparing mRNA levels in different samples. First, the method
is, in principle, similar to competitive RT-PCR, and, with the use of
stringent PCR conditions, is expected to be about as reliable.
Second, display patterns are reproducible. Third, the method
detects the levels not only of RNAs already represented in the
database but also of unknown RNA species that may represent
"new" genes. Fourth, closely related genes can be distinguished
regardless of cross-hybridization, provided there are some single
nucleotide differences in the 3' end sequence. Limitations
associated with this technique are that numerous gels are necessary to get
complete information and that comparison of the levels of different
mRNAs is only approximate because of the differential
amplification of bands of different size or sequence.

Oligonucleotide chip analysis is a fast and effective means of
accessing mRNA expression patterns. Cluster analysis of groups
of samples by this approach is effective. However, the present 
results indicate that alternative methods of verification are desir- 
able before the data on an unexpected change in a particular gene 
are definitively accepted. 

To obtain the broadest range of information from the myeloid
differentiation process, both differential display and oligonucleotide
chip techniques were applied in the current study. As a result, 65.3% of
the observed changes in mRNA levels came from the differential display
method and 41.5% came from oligonucleotide chip assays.

Our data showed in general that changes in expression pattern
by the 2 methods agreed qualitatively but that there was some
quantitative variation. Our results indicate that DD may be a more
accurate way to detect changes in levels of gene expression than the
oligonucleotide chip assay. However, improvements in the types of
oligonucleotides used in arrays may close this gap in the future.

The mRNAs for a limited number of transcription factors vary in a
pattern correlating with that of the mRNAs for primary or secondary
granule proteins. However, more detailed information is needed, and the
underlying mechanisms of granule gene regulation remain unclear. The
number of potential positive and negative regulatory factors found here
is sufficiently small as to make it feasible to perform in vivo studies,
such as chromatin immunoprecipitation.

The oligonucleotide chip used in this study focused on known
genes, whereas the DD method samples all polyadenylated
transcripts. The latter method generated a large number of products not
associated with known genes, in part because the mouse genome is
not as well represented in the database as the human genome.
However, our experience with DD and human mRNAs indicates
that substantial fractions of the products represented as ESTs or not
represented at all in the public databases are cDNA copies from
introns, hnRNA, or other RNA with internal A runs.

Approximately 59 sequences obtained from gel-display bands
had significant changes in the level of expression and a sequence
that did not match that for any named gene in the public databases.
Of these, 38 had plausible or excellent polyA signals. This is only
an approximate estimate of the number of new genes found,
because a fraction of the mRNAs for known genes still had poor
polyA signals. In addition, the full 3' untranslated region is often
not known for characterized genes, and in some cases these new
genes may prove to be identical to products identified by the
oligonucleotide chips when more complete sequences are obtained.
At the least, their presence indicates that a substantial fraction of
the regulatory or functional circuitry of maturing myeloid cells
remains unexplored and that valuable tools for their investigation
will emerge from a combination of RNA expression studies and
analysis of emerging genomic sequences.

The desired end point for the description of gene expression in a
biologic system is not only the analysis of mRNA transcript levels
but also the accurate measurement of protein abundance. The
developments in 2DE and new MS instrumentation make it
possible to accomplish this work rapidly and efficiently. In this
study, we attempted to identify a number of the proteins
differentially expressed between uninduced and ATRA-differentiated MPRO
cells and to examine the relation between mRNA and protein expression
levels for these genes representing the same state.

For protein levels based on estimated intensity of Coomassie dye
staining in 2DE, there was poor correlation between changes in mRNA
levels and estimated protein levels. Other groups have studied the
correlation between mRNA and protein levels in yeast and liver
cells. In the liver cell experiments, correlation coefficients of
0.4 to less than 0.5 were observed. In an extensive study in yeast, the
correlation coefficient was high if the most abundant mRNAs and
proteins were considered. If a handful of these products was omitted, the
remaining correlation coefficient was 0.4 or less. However, one
could restore some of the correlation by averaging individual
data points into broad proteomic categories.
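
The sensitivity of a correlation coefficient to a handful of very abundant products can be illustrated with a small sketch. All numbers below are hypothetical and are not taken from the yeast or liver studies cited above; the helper names are illustrative. The sketch computes a Pearson coefficient with and without two abundant genes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (mRNA, protein) abundances in arbitrary units:
# ten low-abundance genes with weak agreement, plus two very abundant genes.
low = [(1, 5), (2, 1), (3, 6), (4, 2), (5, 4),
       (6, 8), (7, 3), (8, 2), (9, 7), (10, 5)]
abundant = [(200, 180), (300, 310)]


def r_for(pairs):
    """Correlation between the mRNA and protein columns of (mRNA, protein) pairs."""
    mrna, protein = zip(*pairs)
    return pearson(mrna, protein)


r_all = r_for(low + abundant)
r_low = r_for(low)
print(f"all genes: r = {r_all:.2f}")
print(f"without the two abundant genes: r = {r_low:.2f}")
```

With these made-up values the coefficient is close to 1 when the two abundant genes are included and falls sharply when they are left out, which is the pattern the yeast comparison describes.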

The discrepancies between mRNA and protein levels in MPRO cells
appear to be substantially larger than those observed for yeast. Possible
causes for the discrepancies include translational regulation, differential
expression of certain mRNAs at various stages of cell growth in vitro,
post-translational protein modification that varies with the stage of
maturation of the cells, and selective degradation or excretion of proteins
in vivo. Furthermore, here we are focusing on a developmental
time-course, whereas the yeast study concentrated on the organism in
vegetative growth. New techniques, equipment, and bioinformatic
analysis tools must be developed to make such systematic, global, and
quantitative analyses feasible.

The initial studies of protein expression presented here provide a
cautionary note for efforts to interpret cell composition and function in
relation to mRNA levels. Discrepancies we observed between gene
expression and protein abundance suggest that selective
post-transcriptional controls may be at least as important as changes in mRNA levels
in determining the protein composition of neutrophils and that they are
phenomena less well explored than transcriptional control. Analysis of
mRNA expression patterns is itself only a small beginning toward a
genome-wide description of cellular components.

Appendix



Gene symbols used in tables: Actb: actin, beta, cytoplasmic; Actg: actin, gamma,
cytoplasmic; Actx: melanoma X-actin; Aldo1: aldolase 1, A isoform; Arf5:
ADP-ribosylation factor 5; Atf1: activating transcription factor 1; Atf2: activating
transcription factor 2; Btf3: basic transcription factor 3a; Bzrp: peripheral-type
benzodiazepine receptor; C5r1: complement component 5, receptor 1/G
protein-coupled receptor (C5a); Ccnb2: cyclin B2; Cd36l2: CD36 antigen (collagen type
I receptor, thrombospondin receptor)-like 2; Cd53: CD53 antigen; Cebpa:
CCAAT/enhancer binding protein (C/EBP), alpha; Cebpb: CCAAT/enhancer
binding protein (C/EBP), beta; Cebpd: CCAAT/enhancer binding protein
(C/EBP), delta; Cebpe: CCAAT/enhancer binding protein (C/EBP), epsilon; Cfl1:
cofilin 1, nonmuscle; Cmkar4: chemokine (C-X-C) receptor 4; Cmkbr1:
chemokine (C-C) receptor 1/Mip1a receptor; Cnlp: cathelin-like protein; Cntf: ciliary
neurotrophic factor/zinc finger protein PZF; Copa: coatomer protein complex
subunit alpha; Cpa3: carboxypeptidase A3, mast cell; Cr2: complement receptor
2; Crhr: corticotropin releasing hormone receptor; Crry: complement
receptor-related protein; Csf1r: CSF 1 (M-CSF) receptor/c-fms/CD115; Csf2ra: CSF 2
(GM-CSF) receptor, alpha, low-affinity/CD116; Csf2rb1: CSF 2 (GM-CSF)
receptor, beta 2, low-affinity/IL-3 receptor-like protein (AIC2B)/CDw131;



Csf2rb2: CSF 2 (GM-CSF) receptor, beta 2, low-affinity/IL-3 receptor (AIC2A);
Ctsb: cathepsin B; Ctsc: cathepsin C; Ctsd: cathepsin D; Ctse: cathepsin E; Ctsg:
cathepsin G; Ctsh: cathepsin H; Ctsl: cathepsin L; Ctss: cathepsin S; Cybb:
cytochrome b-245, beta; Drd2: dopamine receptor 2; E2f1: E2F transcription
factor 1; Ear2: eosinophil-associated ribonuclease 2; Ebi3: Epstein-Barr
virus-induced gene 3/cytokine receptor-like molecule (EBI3); E12: Balb/c neutrophil
elastase; Ela2: elastase 2; Erh: enhancer of rudimentary homolog (Drosophila);
Etohi6: ethanol induced 6/sterol regulatory element binding transcription factor 1
(SREBF1) homolog; F2rl2: coagulation factor II (thrombin) receptor-like 2;
Fcer1g: Fc receptor, IgE, high affinity I, gamma polypeptide; Fcgr2b: Fc receptor,
IgG, low affinity IIb; Fcgr3: Fc receptor, IgG, low affinity III; Fpr1: formyl
peptide receptor 1/fMLP receptor; Gabpb1: GA repeat binding protein
(GABP-beta1 subunit); Gata2: GATA-binding protein 2; Gnas: guanine nucleotide
binding protein, alpha stimulating; Gnb2-rs1: guanine nucleotide binding protein,
beta-2, related sequence 1; Gpx3: glutathione peroxidase 3; Grg: related to
Drosophila groucho gene; Grid1: glutamate receptor channel subunit delta 1;
Grn: granulin; Gstm1: glutathione-S-transferase, mu 1; Gus-s:
beta-glucuronidase structural; Gys3: glycogen synthase 3, brain; H2-D: histocompatibility 2, D
region locus 1; Hist2: histone gene complex 2; Hist5-2ax: H2A histone family,
member X; Hmg1: high mobility group protein 1; Hsp60: heat shock protein, 60
kDa; Htr5a: 5-hydroxytryptamine (serotonin) receptor 5A; Idb1: inhibitor of
DNA binding 1/helix-loop-helix DNA binding protein regulator (Id); Idb2:
inhibitor of DNA binding 2; Ifngr: interferon gamma receptor; Ifngr2: interferon
gamma receptor 2; Ii: Ia-associated invariant chain; Il1a: IL1 alpha; Il1r2: IL1
receptor, type II; Il2rg: IL2 receptor, gamma chain; Il4ra: IL4 receptor, alpha;
Il10rb: IL10 receptor, beta; Il17r: IL17 receptor; Irf1: interferon regulatory factor
1; Irf2: interferon regulatory factor 2; Itgb2: integrin beta 2 (Cd18); It^^: inositol
1,4,5-trisphosphate receptor (type 2); Jund1: Jun proto-oncogene-related gene
d1/transcription factor JUN-D; Klf2: Kruppel-like factor LKLF; L-CCR:
lipopolysaccharide inducible C-C chemokine receptor-related; Lcn2: lipocalin 2; Ldlr:
low density lipoprotein receptor; Lsp1: lymphocyte-specific 1/S37/pp52; Lst1:
leucocyte-specific transcript 1; Ltb4r: leukotriene B4 receptor; Ltbr:
lymphotoxin-beta receptor; Ltf: lactotransferrin; Ly64: lymphocyte antigen 64; Ly6e:
lymphocyte antigen 6 complex, locus E; Lyl1: lymphoblastomic leukemia/bHLH factor;
Lyzs: lysozyme; M6pr: mannose-6-phosphate receptor, cation dependent; Mad:
Max dimerization protein; Man2c1: mannosidase, alpha, class 2C, member 1;
Max: Max protein; Maz: MYC-associated zinc finger protein (purine-binding
transcription factor); MBP: eosinophil granule major basic protein precursor;
Mcpt8: mast cell protease 8; Mll: myeloid/lymphoid or mixed-lineage leukemia;
Mmp13: matrix metalloproteinase 13/collagenase; Mmp9: matrix
metalloproteinase 9/gelatinase B; Mpo: myeloperoxidase; Myb: myeloblastosis oncogene;
Mybl2: myeloblastosis oncogene-like 2; Myc: myelocytomatosis oncogene;
Myln: myosin light chain, alkali, nonmuscle; Nfatc2: nuclear factor of activated T
cells, cytoplasmic 2; Nfe2: nuclear factor, erythroid-derived 2, 45 kDa; Nfkb1:
NF-kappa-B (p105); Ngp: neutrophilic granule protein; NMDRGB:
N-methyl-D-aspartate receptor glutamate-binding chain homolog; Npm1: nucleophosmin 1;
Nr4a1: nuclear receptor subfamily 4, group A, member 1; Osi: oxidative stress
induced; P2rx1: purinergic receptor P2X, ligand-gated ion channel, 1; P2ry2:
purinergic receptor P2Y, G-protein-coupled 2; P40-8: P40-8, functional/laminin
receptor; Pbx1: B-cell leukemia transcription factor 1; Pfb: properdin factor,
complement; Pira1: paired-Ig-like receptor A1; Pira5: paired-Ig-like receptor
A5; Pira6: paired-Ig-like receptor A6; Pirb: paired-Ig-like receptor B; Plaur:
urokinase plasminogen activator receptor; PMI: putative receptor protein (SP:
PI71^?2 ); Pml: promyelocytic leukemia; Prg: proteoglycan, secretory granule;
Prg3: proteoglycan 3/eosinophil major basic protein 2; Prtn3: proteinase 3;
Psma2: proteasome (prosome, macropain) subunit, alpha type 2; Ptmb4:
prothymosin beta 4; Ptprc: protein tyrosine phosphatase, receptor type, C; Rac2:
RAS-related C3 botulinum substrate 2; Rarg: retinoic acid receptor, gamma;
Rela: avian reticuloendotheliosis viral (v-rel) oncogene homolog A/NF-kappa-B
p65; Rpl19: ribosomal protein L19; RPL8: ribosomal protein L8; Rps6ka1:
ribosomal protein S6 kinase polypeptide 1; Rps8: ribosomal protein S8; Rtn3:
reticulon 3; S100a8: S100 calcium binding protein A8 (calgranulin A); S100a9:
S100 calcium-binding protein A9 (calgranulin B); Sdfr2: stromal cell-derived
factor receptor 2; Sell: selectin L (lymphocyte adhesion molecule 1); Sema4d:
semaphorin 4D; Sepp1: selenoprotein P, plasma, 1; Sfpi1: SFFV proviral
integration 1; Shfdg1: split hand/foot deleted gene 1; Slc10a1: solute carrier
family 10 (sodium/bile acid cotransporter family), member 1; Slpi: secretory
leukocyte protease inhibitor; Sox15: SRY-box containing gene 15; Spi2-1: serine
protease inhibitor 2-1; Srb1: scavenger receptor class B1; Stat3: signal transducer
and activator of transcription 3; Stat5a: signal transducer and activator of
transcription 5A; Stat6: signal transducer and activator of transcription 6; Stra14:
basic-helix-loop-helix protein, retinoic acid induced; Tbx1: TBX1
protein/LPS-induced TNF-alpha factor homolog; Tcrgb: T-cell-receptor germline beta-chain
gene constant region; Tcrg-V4: T-cell-receptor gamma, variable 4; Tctex1:
t-complex testis expressed 1; Tfdp1: transcription factor Dp 1; Tif1b:
transcriptional intermediary factor 1, beta; Tlr4: toll-like receptor 4; Tnfrsf1a: TNF
receptor superfamily, member 1a; Tnfrsf1b: TNF superfamily, member 1b;
Tomm70a: translocase of outer mitochondrial membrane 70 (yeast) homolog A;
Tpi: triosephosphate isomerase; Trp53: transformation-related protein 53; Ubb:
ubiquitin B; Usf2: upstream transcription factor 2; Ybx1: Y box transcription
factor; Ybx3: Y box binding protein; Zfp11-6: zinc finger protein s11-6; Zfp18:
zinc finger protein 18 homolog; Zfp36: zinc finger protein 36; Zfp162: zinc finger
protein 162; Zfp216: zinc finger protein 216; Zfpm1: zinc finger protein,
multitype 1; Znfn1a1: zinc finger protein, subfamily 1A, 1 (Ikaros); Zyx: zyxin.
Bacterial lipopolysaccharide (LPS) evokes several
functional responses in the neutrophil that contribute
to innate immunity. Although certain responses, such as
adhesion and synthesis of tumor necrosis factor-α, are
inhibited by pretreatment with an inhibitor of p38
mitogen-activated protein kinase, others, such as actin
assembly, are unaffected. The aim of the present study was
to investigate the changes in neutrophil gene
transcription and protein expression following
lipopolysaccharide exposure and to establish their dependence on p38
signaling. Microarray analysis indicated expression of
13% of the 7070 Affymetrix gene set in nonstimulated
neutrophils, and LPS up-regulation of 100 distinct
genes, including cytokines and chemokines, signaling
molecules, and regulators of transcription. Proteomic
analysis yielded a separate list of up-regulated
modulators of inflammation, signaling molecules, and
cytoskeletal proteins. Poor concordance between mRNA
transcript and protein expression changes was noted.
Pretreatment with the p38 inhibitor SB203580
attenuated 23% of LPS-regulated genes and 18% of
LPS-regulated proteins by ≥40%. This study indicates that p38
plays a selective role in regulation of neutrophil
transcripts and proteins following lipopolysaccharide
exposure, clarifies that several of the effects of
lipopolysaccharide are post-transcriptional and post-translational,
and identifies several proteins not previously reported
to be involved in the innate immune response.
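
A 40% attenuation cutoff of the kind quoted in the abstract can be illustrated as follows. The abstract does not say how attenuation was calculated, so the formula below (the share of the LPS-induced change that is lost when the inhibitor is present) is an assumption, and the gene names and signal values are hypothetical.

```python
# Sketch: express inhibitor attenuation of an LPS-induced change as a percentage.
# Assumption: attenuation = (LPS - LPS_with_inhibitor) / (LPS - control) * 100.
# This is one common convention, not a formula given in the abstract.


def percent_attenuation(control, lps, lps_with_inhibitor):
    """Percentage of the LPS-induced change that is lost with inhibitor."""
    induced = lps - control
    if induced == 0:
        raise ValueError("no LPS-induced change to attenuate")
    return 100.0 * (lps - lps_with_inhibitor) / induced


# Hypothetical signals (control, LPS, LPS + inhibitor) for three genes
measurements = {
    "gene_a": (100.0, 500.0, 260.0),
    "gene_b": (100.0, 400.0, 370.0),
    "gene_c": (200.0, 800.0, 500.0),
}

CUTOFF = 40.0  # attenuation threshold, in percent

attenuated = [
    gene
    for gene, values in measurements.items()
    if percent_attenuation(*values) >= CUTOFF
]
print(attenuated)
```

In this made-up set, two of the three genes pass the cutoff; the fraction of regulated genes passing it is the kind of summary figure the abstract reports.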

Lipopolysaccharide (LPS), a component of the outer cell wall
of Gram-negative bacteria, evokes a variety of functional
responses in the human neutrophil (PMN) after binding to a
plasma membrane receptor complex that involves the Toll-like
receptors (TLRs) (1-5). These "immediate" functional
responses, including actin assembly, adhesion, activation of
nuclear factor-kappa B (NF-κB), and priming for an enhanced
secretory response and for release of reactive oxygen
intermediates, appear to be central both to the innate immune
response and to the pathogenesis of several inflammatory human
diseases, including sepsis and the acute respiratory distress
syndrome (6). p38 mitogen-activated protein kinase (p38
MAPK) has been shown to mediate LPS-induced PMN
adhesion, NF-κB activation, and TNF-α and IL-8 translation and
release (7), and its blockade attenuates LPS-induced PMN
accumulation in the airspace (8). However, other cascades
almost certainly lead to downstream effectors of the LPS signal;
for example, actin assembly appears to be p38
MAPK-independent (9). An improved understanding of the transcriptional
and translational responses of the neutrophil to LPS and the
modulation of these responses by p38 MAPK might carry
pathogenetic and therapeutic implications.

Historically, it has been believed that the downstream PMN
transcriptional response to LPS is static and that PMN
functional responses to LPS that depend on de novo protein
synthesis are primarily limited to the release of cytokines (10).
However, recent studies indicate a robust transcriptional
response (11). To date, most studies have relied upon and
reported a short list of functional assays of the LPS-exposed
PMN; therefore, no exhaustive investigation of either the
transcriptional response or protein synthetic repertoire of the PMN
has been reported. Although several techniques have been used
to evaluate transcripts, the screening of global changes in
mRNA by microarray analysis has only recently become
possible. In this way, thousands of genes can be screened in an
unbiased fashion for transcript abundance. Such genomic
screens in mammalian cells have previously been applied to
define altered expression profiles in response to agonists (12)
and to drug action (13) and during cell cycle progression (14).

Although DNA microarray technology is expected to provide
insight into the response of the human PMN to LPS (15),
inhibition of LPS-stimulated IL-1 and TNF-α production by
p38 MAPK inhibitors in THP-1 cells (16) and of TNF-α
synthesis in human PMNs (9) occurs at a translational level and
would therefore not be detected by DNA microarrays.
Furthermore, in other systems, such as yeast and human liver, mRNA
and protein levels show poor correlation (17, 18). Proteomics is
a complementary tool for assessing global changes in cellular
protein expression, thereby providing additional insight into
cellular signal regulation. A proteomic approach has proven
useful in different systems for dissecting signal transduction
cascades and describing their output (19, 20) and has even
recently been used to detect novel upstream messengers
involved in LPS signal transduction (21). We have applied DNA
microarrays and proteomics to define and compare
transcriptional and post-transcriptional alterations in the LPS-exposed
PMN and to establish the dependence of these alterations on
p38 MAPK signaling.

EXPERIMENTAL PROCEDURES 

Materials—Endotoxin-free reagents and plastics were used in all
experiments. Aprotinin, leupeptin, AEBSF, E-64, pepstatin, and
bestatin protease inhibitors, spermine HCl, and α-cyano-4-hydroxycinnamic
acid (CHCA) were all purchased from Sigma Chemical Co. (St. Louis,
MO). SB203580, a p38 MAPK inhibitor, was purchased from
Calbiochem-Novabiochem Corp. (San Diego, CA). For two-dimensional PAGE,
rehydration buffer, equilibration buffers, vertical electrophoresis
solutions, and 10% homogeneous polyacrylamide slab gels were purchased
from Genomic Solutions, Inc. (GSI, Ann Arbor, MI). Sequencing grade
porcine trypsin was purchased from Promega (Madison, WI).

LPS Incubation—PMNs were isolated by the plasma Percoll method
(22), a technique that yields less than 6% monocytic contamination, and
resuspended at a concentration of 16.4 X 10**/ml in RPMI 1640 culture
medium (BioWhittaker, Walkersville, MD) supplemented with 10 mM
HEPES (pH 7.6) and 1% heat-inactivated platelet-poor plasma. After
addition of 100 ng/ml Escherichia coli O111:B4 LPS (List Biological),
incubation was carried out with continuous rotation (4 h, 37 °C) both in
the presence and absence of SB203580. Both Affymetrix analysis and
proteomic analysis utilized 75 X 10* cells. For microarray analysis,
nonstimulated and 4-h-treated PMNs were collected from three
separate donors. A more detailed time course following LPS exposure was
performed using polymerase chain reaction. For proteomic analysis,
LPS incubations from separate donors (n = 6) were performed and then
analyzed individually. Control and post-LPS incubation PMNs were
washed (0.34 M sucrose/1 mM EDTA/10 mM Tris) and then lysed in a
modified rehydration buffer (GSI, Ann Arbor, MI) supplemented with 2
M thiourea, 50 mM dithiothreitol (DTT), 22.5 mM spermine HCl, and a
mixture of six protease inhibitors (10 μg/ml aprotinin, 10 leupeptin,
2 mM AEBSF, 5 μM E-64, 1 pepstatin, 10 bestatin). DNA was
pelleted by centrifugation at 260,000 X g for 60 min (23).

Affymetrix Oligonucleotide Array—Five micrograms of total RNA was
isolated with TRIzol (Invitrogen) and RNeasy columns (Qiagen) and
subsequently labeled with biotin as described by Affymetrix. Briefly,
first-strand synthesis was accomplished with Superscript II reverse
transcriptase (Invitrogen) using a T7-oligo(dT)24 primer for 1 h at 42 °C
followed by second-strand synthesis using E. coli DNA polymerase I and
RNase H (Invitrogen) at 16 °C for 2 h. Double-stranded DNA was used as
a template for in vitro transcription with T7 RNA polymerase in the
presence of biotin-labeled UTP and CTP using the BioArray High Yield
RNA transcript labeling kit (Enzo). Fifteen micrograms of cRNA was
fragmented and used for hybridization to Affymetrix HuGene 6800FL
GeneChips. Each sample was hybridized initially using a Test2 GeneChip
to test for sample degradation and full-length in vitro translation. Data
were analyzed using Affymetrix GeneChip software. Results from three
separate donors were analyzed.

Reverse Transcription and Polymerase Chain Reaction—cDNA was
prepared by reverse transcription using 2 μg total RNA, derived from
20 × 10^6 cells that were treated as indicated. Polymerase chain
reactions were performed using specific primers for Mx-1, TNF-α, MCP-1,
pSS, S100A4, and glyceraldehyde-3-phosphate dehydrogenase.

Two-dimensional PAGE—The protein concentration of the lysates
was measured as described by Bradford et al. (24). Poor isoelectric
focusing (IEF) results were encountered unless the polycationic
spermine was diluted (data not shown); therefore, lysates were diluted with
rehydration buffer (GSI, Ann Arbor, MI) to achieve a final spermine
concentration of 6 mM. Equal protein loads (1.5 mg) of control and
LPS-stimulated neutrophils were used to rehydrate IEF gels overnight
(18 cm, pH 3-10 nonlinear Immobiline DryStrip IEF gels, Amersham
Biosciences, Piscataway, NJ). IEF was performed at 20 °C to 100 kVh
(Phaser, GSI) under mineral oil, followed by two 10-min SDS
equilibration steps (DTT and then iodoacetamide-containing equilibration
buffers, GSI) and then by vertical electrophoresis on 10% homogeneous
polyacrylamide slab gels (GSI) at 600 V. Protein spots were visualized
by agitation in colloidal Coomassie Brilliant Blue G-250 (16 h) (25),
followed by destaining in deionized water (20 h). In separate
experiments, control and LPS-stimulated PMN lysates from three donors
were pooled and then analyzed by two-dimensional PAGE using
overlapping narrow isoelectric point (pI) ranges (18 cm, pH 5.0-6.0,
5.5-6.7, and 6-11, Amersham Biosciences, Piscataway, NJ). Identical IEF
and vertical electrophoresis parameters were used for all gels.

Image Analysis of Two-dimensional Gels—Colloidal Coomassie-stained
gels were digitized using a Powerlook II (UMAX Data Systems,
Inc., Taiwan) flatbed scanner with 8-bit dynamic range and 150-dpi
resolution. BioImage (GSI, Ann Arbor, MI) 2D-Analyzer software was
used to locate, quantitate, and match protein spots on the control and
LPS gel images. Analysis was performed by assigning 50 common
anchor spots between paired images; the remaining spots were
compared by a constellation-matching algorithm. All data were then
carefully reviewed by the operator to account for any discrepancies. Protein
loading between control and experimental gels may have varied
because of inconsistencies in rehydration of the different IEF gel strips;
therefore, gel images were normalized so that the sum of the integrated
intensities of all matched spots on paired gels was made equal. Control
and LPS-stimulated gel images from individual donor experiments
were matched to generate composite images; composite images were
then matched into a master composite image to track the LPS response
of protein spots among different donors (26). Only those spots that were
common (image-matched) to all original 12 (pH 3.0-10.0) gels were
considered for further analysis. For these spots, the LPS-induced
change in integrated intensity in the six experiments was subjected to
statistical analysis with a two-tailed Student's t test, and those spots
with p < 0.05 were identified by peptide mass fingerprinting (described
below). For the narrow range (pH 5.0-6.0, 5.5-6.7, and 6-11)
two-dimensional PAGE experiments using pooled donors, only those spots
with concordant regulation exceeding 1.5-fold or that appeared de novo
in the LPS gel in two repeat experiments were further analyzed.
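The normalization and significance steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the intensity values are invented, the test is treated as a t test on per-donor changes (the text does not say whether raw or log-transformed intensities were used, so log2 ratios are assumed here), and the function names are hypothetical rather than part of the image analysis software.

```python
import math
from statistics import mean, stdev

def normalize_pair(control, lps):
    """Scale the LPS gel so both gels share the same total matched-spot intensity."""
    scale = sum(control) / sum(lps)
    return list(control), [x * scale for x in lps]

def t_statistic(changes):
    """t statistic for per-donor changes tested against a mean change of zero."""
    n = len(changes)
    return mean(changes) / (stdev(changes) / math.sqrt(n))

# Hypothetical integrated intensities for three matched spots on one gel pair.
control = [100.0, 200.0, 300.0]
lps = [240.0, 360.0, 600.0]
c_norm, l_norm = normalize_pair(control, lps)

# Hypothetical log2 changes of a single spot in six donor experiments.
log2_changes = [0.70, 0.55, 0.80, 0.62, 0.74, 0.58]

# Two-tailed critical value of Student's t for alpha = 0.05 and 5 degrees of freedom.
T_CRIT_DF5 = 2.571
t_stat = t_statistic(log2_changes)
significant = abs(t_stat) > T_CRIT_DF5
```

Comparing the statistic with the tabulated critical value avoids the need for a t-distribution routine; a statistics package would report the p value directly.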

In-gel Tryptic Digestion—In-gel digestion of protein spots was
performed with sequencing grade porcine-modified trypsin using the
method of Hellman et al. (27). Tryptic peptides were then extracted (50
μl of 60% acetonitrile/6% trifluoroacetic acid, 2 h), and the supernatant
was taken to dryness in a vacuum centrifuge and then redissolved in
trifluoroacetic acid (20 μl, 0.5%). Peptides were then purified and
concentrated using ZipTipC18 pipette tips (Millipore, Bedford, MA).

MALDI-TOF Mass Spectrometry—Analyses were performed on an
Applied Biosystems matrix-assisted laser desorption ionization
time-of-flight (MALDI-TOF) Voyager-DE PRO mass spectrometer
(Framingham, MA) operated in delayed extraction mode. Samples (0.5 μl) were
spotted onto a sample plate to which matrix (0.6 μl of 10 mg/ml CHCA)
was added. The sample-matrix mixture was dried at room temperature
and then analyzed in reflector mode. CHCA was also spotted alone as a
negative control. Spectra were the sum of 100 laser shots, and those
peaks with a signal-to-noise ratio of greater than 3:1 were selected for
data base searching. Spectra were internally calibrated using autolytic
trypsin peptides (m/z 842.51, 2211.10).
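Internal calibration against two known peaks amounts to a linear correction of the m/z axis. The sketch below assumes a simple two-point linear model, which may differ from what the instrument software actually applies; the reference masses are passed in as parameters (the defaults are the commonly cited trypsin autolysis masses, an assumption here), and the observed positions are invented.

```python
def two_point_calibration(obs_low, obs_high, ref_low=842.51, ref_high=2211.10):
    """Return a function mapping observed m/z to calibrated m/z.

    Assumes a linear relation fixed by two reference peaks.
    """
    slope = (ref_high - ref_low) / (obs_high - obs_low)
    intercept = ref_low - slope * obs_low

    def calibrate(mz):
        return slope * mz + intercept

    return calibrate

# Hypothetical observed positions of the two autolytic trypsin peaks.
calibrate = two_point_calibration(842.60, 2211.32)

# A hypothetical sample peak, corrected with the same line.
corrected = calibrate(1366.90)
```

By construction the two reference peaks land exactly on their reference masses, and every other peak is shifted by the interpolated amount.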

Data Base Searching Algorithm—The monoisotopic masses for each
protonated peptide were: (a) entered into the program MS-Fit (available
at prospector.ucsf.edu) for searches against the Swiss-Prot, NCBI,
and GenPept databases, and (b) entered into Mascot (available at
matrixscience.com), an algorithm testing statistical significance of
peptide mass fingerprinting identifications. For MS-Fit searches, masses
derived from trypsin, CHCA, keratin, and Coomassie Brilliant Blue
G-250 were excluded. Search parameters included a maximum allowed
peptide mass error of 0.1 Da (0.8 Da in the few instances in which linear
mode was used), consideration of one incomplete cleavage per peptide,
pI range of 3.0-10.0, and molecular mass range of 1-200 kDa. Accepted
modifications included carbamidomethylation of cysteine residues
(from iodoacetamide exposure following IEF) (28) and methionine
oxidation, a common modification occurring during SDS-PAGE (29).
Protein identifications were assigned when three criteria were met: 1)
statistical significance (p < 0.05) of the match when tested by Mascot
(matrixscience.com); 2) >20% sequence coverage by the tryptic
peptides; and 3) concordance (±15%) with the molecular weight and pI of
the parent two-dimensional PAGE protein spot. The following special
exceptions were considered: (a) protein identifications not fulfilling
criterion 2 were still assigned if criteria 1 and 3 were fulfilled and no
other Homo sapiens proteins with peptide mass-matched p values <
0.05 were identified by Mascot; (b) if criterion 3 was not fulfilled (lower
than expected molecular weight), a cleavage product of the identified
protein was inferred, and the cumulative molecular weight of the tryptic
peptides was compared with that of the two-dimensional PAGE spot to
ensure that it was not exceeded; (c) if criterion 3 was not fulfilled (isolated
discordance between theoretical and observed pI), post-translational
modification of an unrecovered peptide was inferred; and (d) if two or more H.
sapiens protein assignments with >4 mutually exclusive matching
peptides were identified, a protein mixture in the two-dimensional PAGE
spot was inferred and further analysis halted (quantitative conclusions
regarding the individual protein constituents could not be drawn).
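The three acceptance criteria, together with exception (a), can be written as a small decision function. This is a sketch rather than the authors' procedure: the class and function names are hypothetical, the ±15% tolerance is assumed to be relative to the theoretical value, and exceptions (b) through (d) are deliberately left out because they call for inspection of the peptide list.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    mascot_p: float   # p value of the peptide mass fingerprint match
    coverage: float   # fraction of the sequence covered by tryptic peptides
    theo_mw: float    # theoretical molecular weight, kDa
    theo_pi: float    # theoretical isoelectric point
    spot_mw: float    # molecular weight estimated from the gel, kDa
    spot_pi: float    # isoelectric point estimated from the gel

def within(observed, theoretical, tol=0.15):
    """True when the observed value lies within tol (fractional) of the theoretical one."""
    return abs(observed - theoretical) <= tol * theoretical

def accept(c, other_significant_hits=0):
    """Apply criteria 1-3, then exception (a) for low-coverage matches."""
    c1 = c.mascot_p < 0.05
    c2 = c.coverage > 0.20
    c3 = within(c.spot_mw, c.theo_mw) and within(c.spot_pi, c.theo_pi)
    if c1 and c2 and c3:
        return True
    # Exception (a): criterion 2 may fail if no competing significant hit exists.
    return c1 and c3 and other_significant_hits == 0

# Hypothetical candidates for illustration.
good = Candidate(0.01, 0.22, 50.4, 6.86, 50.0, 6.8)
low_coverage = Candidate(0.01, 0.15, 50.4, 6.86, 50.0, 6.8)
wrong_size = Candidate(0.01, 0.30, 85.0, 6.39, 33.0, 6.6)
```

A spot such as `wrong_size` is rejected by this function; in the scheme described above it would instead be examined under exception (b) as a possible cleavage product.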

RESULTS 

Genes Differentially Expressed in LPS-stimulated
Neutrophils—Human PMNs were left untreated or incubated in the
presence of 100 ng/ml LPS for 4 h. As a control to confirm that
the PMNs were quiescent at baseline and that LPS resulted in
normal stimulation, mRNA was isolated, cDNA was prepared,
and PCR for TNF-α was performed. Little TNF-α expression
was seen in nonstimulated cells, whereas LPS treatment led to
an increase in expression in each of the donors subsequently
used for microarray analysis (data not shown). No macrophage
colony-stimulating factor receptor transcript was detected by
oligonucleotide microarray analysis, confirming there was no
significant monocytic contamination.

Human PMNs express a limited repertoire of mRNA transcripts
at baseline but respond to LPS with differential expression
of genes in many families. Considering only those genes
present by microarray analysis in all three donors, unstimulated
PMNs expressed 13.0% (923 of 7070 genes) of the Affymetrix
gene set. Gene classes represented at baseline include
metabolic enzymes, structural proteins, receptors, signaling
proteins, and transcription factors. By comparison, human
monocytes expressed ~40% and human fibroblasts ~35% of the
represented genes (data not shown). By the criterion of a >3-fold
increase in expression in all three donors on Affymetrix
oligonucleotide array analysis, exposure of PMNs to LPS for 4 h
resulted in the up-regulation of 100 genes (Table I).
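The selection rule stated above (a greater than 3-fold increase in every donor) is a simple filter over per-donor fold changes. The sketch below uses invented gene labels and values; the function name is hypothetical and the array software's own output format is not assumed.

```python
def induced_in_all_donors(fold_by_gene, threshold=3.0):
    """Return genes whose fold change exceeds the threshold in every donor."""
    return sorted(
        gene for gene, folds in fold_by_gene.items()
        if min(folds) > threshold
    )

# Hypothetical fold changes (LPS versus control) in three donors.
folds = {
    "GENE_A": [5.1, 4.2, 3.6],   # above threshold in all donors
    "GENE_B": [9.0, 2.8, 7.5],   # fails in one donor
    "GENE_C": [3.0, 3.4, 3.9],   # exactly 3.0 is not greater than 3
}
selected = induced_in_all_donors(folds)
```

Taking the minimum across donors makes the rule strict: a single donor below the threshold removes the gene regardless of how strongly it was induced in the others.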

Genes from several different functional classes were induced
in PMNs following LPS exposure. Of interest, a number of
transcriptional regulators were induced, including transcription
factors of the NF-κB family. The transcriptional NF-κB
complex has previously been implicated in the regulation of the
genes induced by LPS (11). The genes for several cytokines and
chemokines were also found to be up-regulated. These include
TNF-α, IL-1β, MCP-1, MIP-3α, and MIP-1β (Table I).
PCR was performed to confirm the results from the microarray
analysis. PCR analysis on selected genes indicates that the
time course for changes can be rapid or delayed but parallels the
changes found in the array at the 4-h time point (data not
shown). Other up-regulated genes included those for metabolic
enzymes, immune response molecules, kinases, phosphatases,
signaling molecules, adhesion and cytoskeletal components,
interferon-stimulated genes, and those with unknown or
miscellaneous function (Table I).

LPS stimulation of PMN also resulted in the down-regulation
of 56 genes (Table II). Down-regulated genes were identified
as transcriptional regulators, protein and lipid kinases and
phosphatases, structural molecules, and signaling molecules.
Genes for metabolic proteins were also evident, as were several
uncharacterized genes.

Two-dimensional PAGE and Image Analysis—In contrast to
the limited number of transcripts found at baseline, PMNs
were found to express a large number and variety of proteins in
the nonstimulated state (Fig. 1, A and C, and Tables III-V).
Reproducible protein expression patterns were found on the pH
3.0-10.0 gels, and the majority of proteins fell in the pH 5.0-7.0
range (Fig. 1A). The basic region (pH > 7.0) consistently
exhibited poor resolution, precluding meaningful image analysis
and further workup (data not shown). Depending on the
spot-finding parameters (minimum spot intensity, filter width)
selected on the image analysis software, spot-by-spot manual
editing was found to be necessary to avoid over- and underdetected
spots; moreover, further manual editing was performed
to screen for unmatched and mismatched spots following
matching of paired control and LPS-stimulated gels. After spot
editing, ~1200 well-resolved spots were evident on each pH
3.0-10.0 gel. In an attempt to improve resolution of the pI
range bearing the greatest number of well-resolved spots,
overlapping narrow pH range gels (pH 5.0-6.0, 5.5-6.7, 6-11) were
also run. Of interest, a similar number of well-resolved spots
(~1200) were detected on the narrow pH range gels (Fig. 1, C
and D). Assuming a detection limit for Coomassie of 15 ng (0.25
pmol, or 1.5 × 10^11 molecules, for a 60-kDa protein) and a
protein load per gel corresponding to 75 × 10^6 PMNs, we
estimate a detection limit on our gels of 2000 molecules/cell for
a 60-kDa protein. As investigators have suggested in other cell
lines with the use of high resolution two-dimensional PAGE
methods (30), we estimate that >10,000 proteins are expressed
in the resting PMN.
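The detection-limit estimate is a unit conversion and can be checked directly. The short sketch below assumes a load equivalent to 75 × 10^6 cells per gel (the exponent is inferred from the stated result of about 2000 molecules per cell) and uses a rounded Avogadro constant; the function name is hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole

def molecules_per_cell(mass_ng, mw_kda, cells):
    """Convert a detectable protein mass into copies per cell."""
    grams = mass_ng * 1e-9
    grams_per_mole = mw_kda * 1e3
    moles = grams / grams_per_mole
    return moles * AVOGADRO / cells

# 15 ng of a 60-kDa protein is 0.25 pmol, about 1.5e11 molecules.
picomoles = (15 * 1e-9) / (60 * 1e3) * 1e12
limit = molecules_per_cell(mass_ng=15, mw_kda=60, cells=75e6)
```

With these inputs the result is close to 2000 copies per cell, matching the estimate in the text; a smaller protein at the same mass limit would give a proportionally higher copy number.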

Human PMNs respond to LPS with the differential expression
of a large number of proteins. In the six individual pH
3.0-10.0 experiments, the number of protein spots that increased
in integrated intensity by at least 50% following LPS
exposure was 185, 122, 104, 104, 96, and 131, respectively. The
number of protein spots that decreased by at least 60% following
LPS exposure was 72, 151, 102, 98, 128, and 97, respectively.
Although gel-to-gel regional variability in resolution was
expected to account for individual spots not being well visualized
on particular gels, only those spots that were matched to
all 12 original gels were analyzed further. Overall, the number
of spots matched to all 12 original gels was 125. The numbers
of spots that were both matched to all 12 original gels and that
increased by at least 60% in integrated intensity in the individual
experiments following LPS exposure were 46, 13, 17, 27,
22, and 20, respectively. The numbers of spots that were
matched to all 12 gels and that decreased by at least 60% were
6, 22, 17, 22, 34, and 28, respectively. The LPS-induced change
in integrated intensity of the 125 spots that were matched to all
12 original gels was subjected to statistical analysis with a
two-tailed Student's t test, and those spots with statistically
significant (p < 0.05) regulation among the six experiments
were identified by peptide mass fingerprinting (Table III).

Identification of LPS-regulated Proteins—Several proteins
were consistently up-regulated on the pH 3.0-10.0 gels (Table
III), including regulators of inflammation (annexin III) and
signaling molecules (Rab-GDP dissociation inhibitor β). Several
actin fragments were seen to be consistently up-regulated
in the six experiments following LPS exposure (Table III). Of
interest, the proteasome β chain was also consistently
up-regulated. Down-regulated proteins included other signaling
molecules, such as Rho GTPase activating protein 1.

On the pH 5.0-6.0 and 5.5-6.7 gels, several proteins were
found to show increases of greater than 1.5-fold following LPS
exposure (Tables IV and V), including cytoskeletal proteins,
such as moesin, nonmuscle myosin heavy chain, and a putative
phosphorylated form of nonmuscle myosin heavy chain, and
signaling molecules, such as protein phosphatase 1 and
PO4-stathmin. The putative phosphorylated form of nonmuscle
myosin heavy chain (spot UlOl) was positioned 0.03 pH unit more
acidic than the unmodified protein (spot 01202) (Fig. 1D) and
was distinguished by a tryptic peptide (m/z 1366.74) not present
in the unmodified protein, consistent with phosphorylation
of serine 685. Serine 685 is predicted by NetPhos 2.0 Prediction
Server (available at www.cbs.dtu.dk/services/NetPhos/OD) to
be a high probability phosphorylation residue and by
ScanProsite (www.expasy.ch/tools/scnpsite.html) to be a substrate
for protein kinase C. The tryptic phosphopeptide identified in
PO4-stathmin, extending from residues 15 to 27 (1468.7 Da), is
consistent with phosphorylation of either serine 16, a known
substrate for Ca2+/calmodulin (CaM)-dependent kinases (32),
or serine 25, a known substrate for p38δ and ERK (Fig. 2A)
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Description 
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Table I 

Human neutrophil genes induced after 4 h of LPS exposure

Description                                            GenBank no.       Change (-fold)

Transcriptional regulation
  Pleiomorphic adenoma gene-like 2                     D83784            16.8
  NFKB2                                                S76688            12.3
  NFKBIE                                               U91616            11.6
  p65                                                  L19067            8.4
  BCL3                                                 U06681            7.7
  X-box binding protein 1                              MS1627            7.5
  Metal-regulatory transcription factor 1              X78710            7.4
  Ets-2                                                J04102            7.4
  c-Rel                                                X76042            6.2
  NFKB1                                                M6d60S            6.8
  Basic leucine zipper transcription factor, ATF-like  U15460            4.7
  IKB                                                  M8904S            3.8
  MAX dimerization protein                             L06695            3.6
  DIF2                                                 S819U             3.1

Cytokines and receptors
  MCP-1                                                M69203            78.7
  MIP-1β                                               M72886            48.8
  α-Helix coiled-coil rod homolog                      AF014968          20.8
  IL-1β                                                X04600            17.6
  GRO3 (beta)                                          M67731            17.3
  TNF-α                                                X02910            14.6
  MIP-3α                                               U64197            8.1
  IL10RA                                               U00672            7.3
  IL-6                                                 Y00081            6.3
  GROα                                                 X64489            4
  HM74                                                 D10923            3.8

Immune response
  Orosomucoid                                          X02644            20.2
  Complement component C3                              K02766            12.8
  Protease inhibitor 9                                 U71864            9.5
  Complement component 3a receptor 1                   U28488            6.1
  Protease inhibitor 3                                 L1084S            4.9
  SLPI/antileukoprotease                               X04470            4.7
  ELANH2/elastase inhibitor                            M93066            4.6
  CD58                                                 Yooeas            3.8
  Complement component PFC                             Md3662            3.6

Kinases
  CNK/FNK/PLK-like                                     U66998            16.2
  Cot                                                  D14497            11.9
  (description not legible)                            U77736            9.6
  (description not legible)                            D46906            4.3

Phosphatases
  PAC-1/DUSP2                                          L11829            11.8
  DUSP6                                                U16932            6.3
  PHA1                                                 U73477            3.4

Signaling molecules
  TNFAIP1/A20                                          M69466            10
  TRAF1                                                U19261            6.2
  RanBP2                                               D42068            6.6
  GNA16                                                M63904            6.2
  PTAFR                                                D10202            3.9

Adhesion and cytoskeleton
  ICAM1                                                M24283            22.4
  CEACAM1 (biliary glycoprotein)                       X16364            6.3
  LIMS1                                                U09284            6.1
  SNL/actin bundling protein                           U03067            5.9
  Galectin 1/LGALS1                                    M57710            4.7
  MEMD/ALCAM                                           U30999            4.2
  CD44                                                 Ha298X".HT3126    3.9
  TSG-6                                                MSn65             3.7

Metabolic
  GTP cyclohydrolase I                                 U19523            13.6
  NDUFV2/ubiquinone reductase                          M22638            8.6
  PSMA6 (proteasome iota)                              X69417            8.4
  UDP-galactose transporter (SLC35A2)                  D84464            7.3
  PLAU (urokinase)                                     X02419            6.4
  KYNU/kynurenine hydrolase                            U67721            5.5
  AMPD3                                                D12776            5
  P4HA1/prolyl 4-hydroxylase                           M24486            4.7
  γ-Glutamylcysteine synthetase                        L36646            4.5
  ATP6D                                                J06682            4.2
  ATP6S1                                               D16469            4
  Glycerol kinase                                      X6d286            3.6
  FACL1                                                L09229            3.6
  AK3                                                  X60678            3.3

Interferon-inducible
  ISG15                                                M13765            22.6
  Mx1                                                  Ma3882            19.4
  IFI56                                                M24694            12.1
  INDO                                                 M34465            6.2
  GBP1                                                 M66642            4.3
  PRKR                                                 U60648            3.7
  IFIT4                                                U62618            3.6
  IFI54                                                M14660            3.5
  IFI58                                                Ud4606            3.6
  IFP35                                                U72882            3

Other
  G0S2                                                 M7288&            48.8
  MIHC/IAP1                                            U87546            7.2
  KlAAOm                                               D14661            6.1
  KIAAOna                                              D42087            6
  SNAP23                                               U66936            6
  CASP6                                                U28015            4.8
  KIAA0113                                             DS0765            4.8
  KIAA0255                                             D87444            4.7
  Hepatoma-derived GF                                  Dl64dl            4.7
  PTGS2                                                028236            4.6
  CD48                                                 M37766            4.3
  UNC119 homolog                                       U40998            4.2
  KIAA0151                                             D63486            3.9
  Rab1b                                                XM085660          3.8
  Annexin VII                                          J04643            3.7
  KIAA0110                                             D148n             3.7
  Adrenomedullin                                       D14874            8.7
  AIM1                                                 U83116            8.6
  KIAA0250                                             D87437            3.2
  P5-1                                                 L06176            3.2
  Scavenger receptor expressed by endothelial cells    D63483            3.2
  VHL                                                  L16409            3.1



(33). Assuming that no other multiply phosphorylated stathmin
species had escaped detection, analysis of the integrated
intensities of the PO4-stathmin and stathmin spots indicates
that the percentage of the PO4 form of total cellular stathmin
increased from 11% to 38% with LPS stimulation (Fig. 2B).
This is similar to a previous report of an increase from <10% to
35-40% of the Ser-phosphorylated form in Jurkat cells
stimulated with anti-CD3 (34).

Effect of SB203580 on LPS-stimulated Gene Expression—Gene
expression analysis of PMNs stimulated with LPS indicated
that the majority of genes induced by LPS were unaffected
by prior treatment of PMN with SB203580. Of the 100
genes up-regulated by LPS, the up-regulation of 23 was inhibited
by greater than 40% (Table VI). The majority of these
genes affected by SB203580 were inhibited by less than 60%,
whereas only six were inhibited by greater than 80%, all of
which represent previously identified interferon-stimulated
genes. Induction of cytokine genes by LPS, with the exception
of IL-6, was generally unaffected by SB203580.

Effect of SB203580 on LPS-stimulated Protein Expression—Similar
to the effect of SB203580 on LPS-stimulated gene
expression, little effect of SB203580 was seen on expression
levels for the majority of LPS-regulated proteins (Table VII).
Two exceptions are annexin III and α-enolase, for which
LPS-stimulated expression was attenuated in the presence of the
p38 MAPK inhibitor.

Comparison of Microarray and Proteomics Results—Of the
LPS-regulated proteins identified by peptide mass fingerprinting
for which probes were present on the oligonucleotide microarray,
poor concordance was found at the mRNA level (Table
VIII). For 13 LPS-up-regulated proteins, 2 corresponding
mRNA transcripts were up-regulated, 1 was down-regulated, 5
were unchanged, and 5 were not detected by the Affymetrix
chip. For 5 down-regulated proteins, 3 corresponding
transcripts were down-regulated, 1 was unchanged, and 1 was not
detected. Varying patterns of LPS regulation emerge for those
candidates detected at both the transcript and protein level.
Proteasome β chain was up-regulated at both the transcript
and protein levels (Table VIII), with no notable effect of
SB203580 on expression at either level. Similarly, CAP1,
Rho-GAP1, and ficolin 1 were down-regulated at both the mRNA
transcript and protein level (Table VIII), with no notable effect
of SB203580. Annexin III was down-regulated at the transcript
level and up-regulated at the protein level, with an inhibitory
effect of SB203580 seen only at the protein level (Tables VII
and VIII).
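The concordance counts quoted above can be tallied to show how few transcript changes matched the direction of the protein change. The sketch below simply restates the counts from the text; the dictionary layout and the function name are illustrative choices, and treating "not detected" transcripts as uninformative is an assumption.

```python
# Transcript outcomes for proteins grouped by the direction of the protein change.
up_proteins = {"up": 2, "down": 1, "unchanged": 5, "not_detected": 5}
down_proteins = {"down": 3, "unchanged": 1, "not_detected": 1}

def concordance(counts, direction):
    """Return (concordant transcripts, transcripts detected on the array)."""
    detected = sum(n for outcome, n in counts.items() if outcome != "not_detected")
    return counts.get(direction, 0), detected

up_match, up_detected = concordance(up_proteins, "up")
down_match, down_detected = concordance(down_proteins, "down")

total_match = up_match + down_match
total_detected = up_detected + down_detected
fraction = total_match / total_detected
```

On these counts, 5 of the 12 proteins whose transcripts were detected changed in the same direction at both levels, which is the basis for describing the concordance as poor.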

DISCUSSION 

Interaction of bacterial LPS with the human PMN represents
a model system for studying the activation and output of
the innate immune system during infection and inflammation.
A recent publication (35) describes the gene expression changes
of a cultured monocytic cell line after infection by the
Gram-positive bacterium Listeria monocytogenes. The cell wall
components of Gram-positive bacteria, like Gram-negative-derived
LPS (i.e. from E. coli), are known to signal through TLRs (36,
37). Importantly, many of the expression changes found in
LPS-stimulated PMNs in the present study were also described
in the bacteria-exposed monocytic cells, indicating that many of
the gene expression changes seen in bacterial infection are
likely mediated by TLRs (38, 39) and that the LPS model
system accurately reflects exposure of immune cells to infection.
Nevertheless, the reliance upon DNA microarrays alone
affords insight only into the transcriptional response without
corroboration at the protein level. In the present study,
application of both DNA microarray and proteomics technology to
our model system provides unique insight into both the cellular
biology of the activated PMN and the responsiveness and
regulation of its transcriptional and translational machinery. As
will be discussed below, our study identifies, in particular,
novel aspects of the LPS-stimulated PMN transcriptional
regulation, activity in the innate immune response, signaling,
cytoskeletal reorganization, and priming for granule release.

Table II

Human neutrophil genes repressed (>4-fold) after 4 h of LPS exposure

Description                                            GenBank no.       Change (-fold)

Kinases
  CAMK II, gamma                                       U60360            -4
  Diacylglycerol kinase, delta                         D63479            -4.2
  PRKCL2/PRK2 protein kinase C-like 2                  U8S062            -4.3
  MAPKAPK3                                             U09578            -6.3
  Protein kinase Ht31, cAMP-dependent                  Ha2167-HT2237     -8
  CAMKII                                               L07044            -9.8

Transporters
  SLC25A5/solute carrier family 25, member 5           J02688            -4.2
  SLC19A1; folate transporter                          U17566            -4.4
  SLC2A3; facilitated glucose transporter              M20681            -6

Metabolic
  Carbonic anhydrase IV                                L10955            -4.4
  RNase A family, k6                                   U64998            -4.6
  Glycogen phosphorylase; liver                        M14636            -4.6
  Inositol polyphosphate-5-phosphatase                 U67650            -4.6
  Inositol 1,3,4-trisphosphate 5/6-kinase              U61336            -4.7
  Transketolase                                        L12711            -4.8
  Protein phosphatase 4, reg. subunit 1 (clone 23840)  U79267            -4.9
  Cytidine deaminase                                   L2794d            -6.4
  MGAT1                                                M66621            -6.4
  HMOX1                                                X06986            -6.4
  MAN2A2                                               L28821            -6.8
  Glycogenin (also represents U31626)                  HG}43S4-HT4604    -5.9

Structural
  Fibrinogen-like protein (pT49 protein)               Z86531            -4.2
  H2AFZ                                                M87683            -4.7
  Paxillin                                             U14688            -4.9
  Lamin B R                                            L269S1            -6.9
  Dynamin 2                                            L36983            -6.2
  Actinin 1                                            M96178            -6.7
  α-Tubulin                                            X01703            -10
  Tubulin, α1 isoform 44                               HQ2259-HT2d48     -15

Transcriptional regulators
  Lymphoblastic leukemia-derived sequence 1            M226d8            -4.4
  MAX-interacting protein 1                            L07648            -4.6
  Nuclear factor erythroid 2 isoform f                 S77768            -6
  Transducer of ERBB2, 1                               D38306            -6.9
  NFATC4                                               L41067            -7.8
  ATF-2 (CRE-BPa)                                      L06616            -9.6

Receptors
  Lymphotoxin β receptor                               L04270            -4.4
  Folate receptor 3 (gamma)                            U08471            -6
  (description not legible)                            U11876            -6.3

Signaling
  Pix-α; cool-2 (KIAA0006)                             D26304            -4.6
  ARHB/RhoB                                            M12174            -4.6
  TNFSF10; TRAIL                                       U37518            -6.6

Ca2+ binding
  ANX11                                                L19605            -4.3
  S100A4                                               M80563            -4.8
  ANX1                                                 X06908            -4.8

Other (only 11 of the 12 change values are legible, so they are listed
in printed order below rather than assigned to rows)
  Proteolipid protein 2                                L09604
  Protein phosphatase 1, α catalytic subunit           HG1614-HT1614
  TIMP2                                                M82304
  KlAAOm                                               D83782
  Lipin 2 (KIAA0249)                                   D874S6
  LRMP (Jaw1)                                          U10486
  CUGBP2                                               U69646
  Clone 23933                                          U79273
  PECAM1                                               L34667
  Delta sleep-inducing peptide                         Z60781
  DiGeorge synd. critical region gene 2 (KIAA0163)     D79986
  SELPLG; CD162; selectin P ligand                     U26966
  Change (-fold): -4.9, -6, -6.1, -6.2, -5.6, -6.8, -6.9, -7, -8, -8.7, -9

Fig. 1. Two-dimensional PAGE of LPS-exposed human PMNs.
A and B, colloidal Coomassie Blue-stained pH 3.0-10.0, two-dimensional
PAGE gels (A, control; B, LPS-exposed) with up-regulated (solid
arrows) and down-regulated (hatched arrows) proteins indicated. These
results are representative of six separate experiments. C and D,
colloidal Coomassie Blue-stained pH 5.0-6.0, two-dimensional PAGE gels
(C, control; D, LPS-exposed) with up-regulated (solid arrows), new
(solid arrow, open arrowhead), and down-regulated (hatched arrows)
proteins indicated. LPS-exposed PMNs from three blood donors were
pooled.

In the present study, the increase in NF-κB transcript
abundance (Table I) detected by the microarrays corroborates the
findings of other studies of PMNs and monocytes (40) and
indicates a mechanism for the responsiveness and scope of the
PMN transcriptional machinery following LPS exposure.
NF-κB, recently described to be activated by LPS through the
TLR/MyD88/interleukin-1 receptor-associated kinase pathway
(1, 4), is the only transcriptional complex reported to be
induced by LPS in the PMN. However, because the transcriptional
NF-κB complex has been implicated in the regulation of
only a portion of the genes induced by LPS in this study (data
not shown), the importance of alternative transcriptional
regulators in the PMN is clear. Of interest, several other known
and putative transcriptional regulators with less well defined
functions were also up-regulated in the present study, including
PLAGL2, a putative zinc-finger protein, XBP-1, MTF-1,
Ets-2, B-ATF, and DIF-2. On the other hand, LPS-down-regulated
genes include ATF-2 (a known target of p38), NFATC4,
TOB-1, NF-E2, MXI-1, and LYL-1. Although the exact role of
these gene products in regulating cell function is unknown,
these data indicate that the range of transcriptional responses
in the LPS-stimulated PMN is much broader than previously
suggested and that the signaling capabilities of the PMN in the
immune response are thereby likely extended in scope and
specificity.

As expected from the literature, the genes for several cytokines
and chemokines, including IL-1β, IL-6, and MIP-1β, were
found to be up-regulated (Table I). On the other hand, the
notable absence of up-regulated cytokines in the proteomics
experiments reflects their removal in the post-LPS incubation
wash performed prior to lysis for two-dimensional PAGE.
Up-regulation of these inflammatory mediators is well documented
in PMNs exposed to LPS and in animal models of LPS-induced
sepsis syndrome and acute respiratory distress syndrome, a
PMN-mediated illness (41, 42). Several genes in this family
were up-regulated that have not, to our knowledge, been
described in LPS-stimulated cells, including MCP-2, GROβ,
IL-10RA, and HM74, an orphan G protein-coupled receptor
with homology to chemokine receptors. The down-regulation of
TNFSF10, lymphotoxin β receptor, and TNFAIP1 was also
observed. The modulation of genes involved in cytokine signaling,
including the adapter molecules TRAF1 (LPS and TNF
receptor signaling) and TNFAIP1 (TNF receptor signaling) and
several kinases and phosphatases, may indicate a change in
cytokine responsiveness after LPS treatment. Relevant in this
regard from the proteomics data are: 1) the up-regulation of
protein phosphatase 1, which has been shown to regulate PMN
NADPH oxidase activation and translocation (43, 44) and to
regulate LPS-induced NF-κB activation (45); 2) the
down-regulation of Rho-GAP1, which has been shown to regulate
NADPH oxidase activity in the PMN (46); and 3) the
up-regulation of PO4-stathmin (Table IV), a phosphoprotein
postulated to function as a relayer and integrator of multiple
signal transduction pathways (34). Several noncytokine,
nonchemokine genes involved in the immune response were
also up-regulated, including the complement pathway members
C3, C3AR1, and PFC; the protease inhibitors ELANH2
(elastase inhibitor), SLPI, PI-3, and PI-9; and the acute phase
protein orosomucoid. LPS regulation of C3AR1 and orosomucoid
expression has not previously been reported. In the
proteomics experiments, the down-regulation of ficolin-1 (Table III),
a collectin-like cell surface protein reported to activate the
complement system and to mediate adhesion and phagocytosis in
monocytes but not previously reported in granulocytes (47), may
represent negative modulation of the innate immune response.
The finding that genes other than cytokines and chemokines are
regulated by the PMN in response to LPS indicates that the PMN
plays a more sophisticated role in host-defense and immunity
than previously thought.

Treatment of the PMN with LPS led to the induction of a set
of genes associated with the anti-viral Type I interferons,
IFNα/β. This induction occurs independently of the release of
IFN or another unidentified soluble factor. Furthermore, the
set of genes expressed is smaller than that induced by IFNα/β,
as described by Der et al. (12). This may be due to differences in
the scope of the signaling systems activated by LPS and
IFNα/β, or the time course of analysis of genes in the
LPS-stimulated PMN. The implication that LPS treatment of PMN
allows PMN to express anti-viral activity is currently being
tested. Of interest was the finding that induction of
interferon-stimulated genes was blocked by pretreatment of PMNs with
SB203580. Work from our laboratory has indicated that signal
transducers and activators of transcription activation does not
occur in response to LPS in PMNs.^ In addition, interferon- 



K. C. Malcolm and G. S. Worthen, manuscript in preparation.

Table III

Analysis of pH 3.0-10.0 two-dimensional PAGE gels
Mean change (-fold) in expression level among six PMN donors is reported. The change in expression for the proteins listed was statistically
significant (p < 0.05) as measured by a two-tailed Student's t test.

Identification (spot no.)                    Swiss-Prot  Estimated  Theoretical  Peptides      Protein  Mean
                                             no.         Mr/pI      Mr/pI        matched/      covered  change
                                                                                 submitted (%) (%)      (-fold)

Up-regulated
Proteasome β chain (646)                     P28070      27/6.7     29.2/6.72    9/12 (76%)    36%      1.61
Annexin III (660)                            P12429      31/6.7     36.4/6.6     14/18 (78%)   42%      1.37
Actin fragment (644) a                       P02570      32/5.6     (41.7/6.29)  13/16 (87%)   (34%)    1.74
Actin fragment (691) a                       P02570      30/6.4     (41.7/6.29)  14/16 (78%)   (29%)    1.60
α-Enolase (360)                              P06733      41/5.7     47.2/7.01    9/10 (90%)    24%      1.66
Rab-GDP dissociation inhibitor β (289)       P50395      60/6.1     50.7/6.11    10/11 (91%)   26%      1.24
Glutathione S-transferase P (648)            P09211      23/6.6     23.4/6.43    6/8 (76%)     41%      1.64
Pre-B-cell colony enhancing factor (1162)    P43490      53/7.0     66.6/6.69    12/16 (76%)   26%      1.29

Down-regulated
Adenylyl cyclase-associated protein 1 (266)  Q01518      65/7.3     61.7/8.07    16/22 (73%)   34%      0.63
Rho-GAP1 (283)                               Q07960      50/6.8     50.4/6.86    7/9 (78%)     22%      0.67
Ficolin 1 (611)                              O00602      33/6.6     85/6.39      10/12 (83%)   26%      0.74

a The theoretical pI and Mr of native actin are indicated. Protein coverage indicates coverage of native actin.


Table IV

Analysis of pH 5.0-6.0 two-dimensional PAGE gels
Results are from pooled samples for control (n = 3) and LPS-exposed (n = 3) PMNs from human donors. Expression of the reported proteins was
altered > 1.5-fold following LPS exposure in two repeat experiments. "New" designates proteins seen in the LPS gel in two repeat experiments but
not detectable in the corresponding control gels.

Identification (spot no.)                                  Swiss-Prot   Estimated  Theoretical  Peptides      Protein  Change
                                                           no.          Mr/pI      Mr/pI        matched/      covered  (-fold)
                                                                                                submitted (%) (%)

Up-regulated
Protein-tyrosine kinase 9-like (468)                       Q9Y8P6 a     34/6.81    89.5/6.37    10/14 (71%)   34%      1.8
Protein phosphatase 1, catalytic subunit, β isoform (878)  P37140       38/6.73    37.2/6.84    7/10 (70%)    22%      2.0
PO4-stathmin (677)                                         [illegible]  18/5.36    17.3/6.76    9/12 (75%)    42%      2.1 e
Nonmuscle myosin heavy chain (1102)                        [illegible]  146/5.32   146/6.23     20/21 (95%)   17%      New
Putative PO4-nonmuscle myosin heavy chain (1101) d         [illegible]  145/5.29   146/6.23     14/16 (87%)   13%      New
Leukocyte elastase inhibitor (318)                         [illegible]  42/5.71    42.7/6.9     9/13 (69%)    22%      2.4
Grancalcin (1004)                                          [illegible]  24/5.36    24.0/5.02    7/10 (70%)    31%      New

Down-regulated
Adenosylhomocysteinase (324)                               P23526       48/5.82    47.7/6.04    7/9 (78%)     14%      0.4
PEST phosphatase interacting protein homolog (234)         4100162 f    48/5.80    47.6/6.36    11/13 (86%)   30%      0.6

a TrEMBL accession number.
b Accession number and theoretical pI and Mr for the unmodified protein are indicated.
c NCBI accession number.
d See text for explanation.
e Among three experiments, the ratio of PO4-stathmin expression increase following LPS exposure in the presence of SB203580 divided by that
in the absence of SB203580 was 0.98.
f Genpept accession number.
g This search was performed using average masses measured by linear mode MALDI-TOF MS.

Table V

Analysis of pH 5.5-6.7 two-dimensional PAGE gels
Results are from pooled samples for control (n = 3) and LPS-exposed (n = 3) PMNs from human donors. Expression of the reported proteins was
altered ≥ 1.5-fold following LPS exposure in two repeat experiments.

Identification (spot no.)       Swiss-Prot  Estimated  Theoretical  Peptides      Protein  Change
                                no.         Mr/pI      Mr/pI        matched/      covered  (-fold)
                                                                    submitted (%) (%)

Up-regulated
Transaldolase (475)             P37837      38/5.95    37.6/6.36    13/17 (76%)   33%      2.5
Isocitrate dehydrogenase (431)  O75874      46/6.25    46.7/6.36    7/7 (100%)    13%      2.3
Moesin (201)                    P26038      61/6.09    67.8/6.07    11/13 (85%)   17%      2.1
α-Enolase (469)                 P06733      43/6.64    47.2/7.01    7/10 (70%)    17%      3.8

Down-regulated
Calponin H2 (240)               Q99439      34/6.65    38.7/6.94    10/11 (90%)   27%      0.6


regulatory factor 3, a known regulator of interferon-stimulated gene
transcription, is not a direct target of p38 kinase.^ Therefore, gene
expression analysis of LPS-stimulated PMNs has uncovered a previously
uncharacterized signal transduction system that is sensitive to
inhibition of p38 MAPK.

Knowledge of the genes down-regulated by LPS permits the development of
further hypotheses addressing PMN function in the face of infection.
Strikingly, several down-regulated genes and gene products are
structural in nature (e.g. paxillin, actinin, calponin H2) (Tables II
and V). A known consequence to the PMN of LPS exposure is decreased
motility (48). Up-regulation of genes for adhesion molecules (ICAM-1,
CD44, ALCAM, and TSG-6), and down-regulation of genes for structural
proteins, indicates a genetic basis for this observation.
Down-regulation of two genes implicated in cytoskeletal regulation,
Pix-α and RhoB, was also observed. The calcium-binding protein S100A4,
down-regulated in LPS-treated PMNs (Table II), has been implicated in
cell motility and metastasis (49). Decreased motility may be beneficial
in sustaining the inflammatory response at sites of infection. In
addition, LPS treatment results in an inhibition of apoptosis (50).
Therefore, the longer residence time of the PMN at sites of infection is
consistent with the long term genetically coded changes seen in these
gene-profiling experiments and indicates that the changes in gene
expression are functionally relevant to host defense and immunity.

FIG. 2. A, the predicted sequence of the tryptic phosphopeptide in
PO4-stathmin (1468.72 Da): ASGQAFELILSPR (residues 15-27). The peptide
mass measured by MALDI-TOF MS and the predicted mass differed by 14 ppm.
As indicated, two alternate phosphorylation sites are possible: serine
16 and serine 25. B, PO4-stathmin and stathmin were identified on the
control and LPS-exposed pH 5.0-6.0 gels. Consistent with
phosphorylation, the PO4-stathmin spot was distinguished by a peptide of
mass 1468.72 Da (i.e. 80 Da greater than the peptide of 1388.72 Da seen
in the stathmin spot). Assuming that no other multiply phosphorylated
stathmin species have escaped detection, analysis of the integrated
intensities of the PO4-stathmin and stathmin spots indicates that the
percentage of the PO4 form of total cellular stathmin has increased from
11% to 38% with LPS stimulation. The decrease in integrated intensity
for stathmin was equal in amount to the increase in PO4-stathmin
following LPS exposure.
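
The mass arithmetic in the Fig. 2 legend can be checked with a short calculation. The sketch below is not part of the original analysis: it assumes that the quoted peptide masses are singly protonated monoisotopic values, uses standard monoisotopic residue masses, and treats phosphorylation as the addition of HPO3. The function names and the example measured mass are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "S": 87.03203, "G": 57.02146, "Q": 128.05858,
    "F": 147.06841, "E": 129.04259, "L": 113.08406, "I": 113.08406,
    "P": 97.05276, "R": 156.10111,
}
WATER = 18.01056       # added once per peptide (free N and C termini)
PROTON = 1.00728       # charge carrier for a singly protonated ion
PHOSPHO = 79.96633     # mass added by one phosphate group (HPO3)


def mh_plus(sequence: str, n_phospho: int = 0) -> float:
    """Monoisotopic [M+H]+ mass of a peptide with n_phospho phosphates."""
    residues = sum(RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + PROTON + n_phospho * PHOSPHO


def ppm_error(measured: float, predicted: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - predicted) / predicted * 1e6


if __name__ == "__main__":
    peptide = "ASGQAFELILSPR"  # tryptic peptide shown in Fig. 2A
    print(round(mh_plus(peptide), 2), round(mh_plus(peptide, 1), 2))
    # Hypothetical measured mass, used only to illustrate the ppm formula.
    print(round(ppm_error(1468.74, 1468.72), 1))
```

Under these assumptions the singly phosphorylated peptide comes out close to the 1468.72 Da quoted in the legend, and a 14 ppm discrepancy at that mass corresponds to roughly 0.02 Da.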

By providing information on post-translational modification,
the proteomics data may provide further insights into the cy-



Table VI

Effect of SB203580 on LPS-stimulated gene expression

Genes are reported for which the SB203580/control expression ratio
is ≤ 0.60.

Gene name

-fold change ratio
(SB203580/control)

Change in absence
of SB203580


ISG15
NCR
Mx-1
IFI56
PI-9
Ets-2
Rel
UMSl
C3AR1
INDO
KIAA0105
SNAP23
SLPI
ELANH2
HM74
PKR
IFIT4
Glycerol kinase
IFI54
IFI58
IPF36



0.09. 
0.38 
0 
0 

0.67 
0.69 
0.46 
0.60 
0.68 
0.49 
0.35 
0.41 
0.68 
0.58 
0.49 
0.57 
0 

0.21 
0.12 
0 
0 

0.39 
0.46 



fold 
22.6 
20.8 
19.4 
12.1 
9.6 
7.4 
6.8 
6.2 
6.1 
6.1 
6.2 
5.1 
6.0 
4.7 
4.6 
3.8 
3.7 
3.6 
3.6 
3.6 
3.6 
3.6 
3.0 



Table VII

Effect of SB203580 on LPS-stimulated protein expression

Protein name                           -fold change ratio   Change in absence
                                       (SB203580/control)   of SB203580 (-fold)

Up-regulated
Proteasome β chain                     0.8                  1.51
Annexin III                            0.6                  1.37
Actin fragment (644)                   0.8                  1.74
Actin fragment (591)                   0.8                  1.60
α-Enolase                              0.6                  1.66
Rab-GDP dissociation inhibitor β       1.1                  1.24
Glutathione S-transferase P            1.2                  1.64
Pre-B-cell colony enhancing factor     1.2                  1.29

Down-regulated
Adenylyl cyclase-associated protein 1  1.3                  0.63
Rho-GAP1                               0.8                  0.67
Ficolin 1                              1.0                  0.74



toskeletal remodeling effects of LPS upon the PMN. We contend that the
actin fragments identified (Table III) are unlikely to represent
technical artifacts. Rather, their specificity (identical molecular
weight/pI among different experiments), statistically significant
up-regulation by LPS, as well as the use of a lysis buffer containing
chaotropes and multiple protease inhibitors argue instead that these
fragments are physiologic consequences of LPS exposure in the human PMN.
More specifically, the up-regulation of these fragments following LPS
exposure (Table III) suggests that LPS may activate an actin-cleaving
enzyme, which, in turn, remodels the cytoskeleton. Intriguing in this
vein, calpain has recently been reported to play an important role in
cell migration and cytoskeletal organization of fibroblasts (51). The
possibilities that LPS may induce calpain activation and that calpain
activation may regulate cytoskeletal reorganization and motility are
currently under investigation. An alternative possibility is that actin
cleavage is a marker of neutrophil apoptosis (52).

Other LPS-regulated proteins may play important roles in cytoskeletal
reorganization. The up-regulation of protein-tyrosine kinase 9-like
(A6-related protein) may modulate LPS-induced actin polymerization,
because it bears a high degree of homology to twinfilin (A6), an actin
monomer-binding protein that localizes to sites of rapid filament
assembly in cells and is believed to regulate actin filament turnover
(53). In turn, LPS-induced down-regulation of Rho-GTPase activating
protein 1 (Table III) may regulate twinfilin (and protein-tyrosine
kinase 9-like) activity, because twinfilin has been shown to colocalize
with Rac1 and Cdc42 and to be regulated by active Rac1 in NIH 3T3 cells
(53). Activation of Rho proteins may be facilitated by LPS up-regulation
of moesin (Table V), because moesin reportedly induces the dissociation
of Rho from GDI (54). Rac1 may, in turn, promote activation of the actin
filament-nucleating Arp2/3 complex through interactions with WASP
(Wiskott-Aldrich Syndrome protein) family proteins (55) and,
interestingly, is postulated to regulate the dynamics of both the actin
and microtubule cytoskeletons via phosphorylation of stathmin (Table IV)
(56). Calponin H2 is an actin-binding protein not previously reported in
PMNs that is postulated to play a role in cytoskeletal organization
(57). Its down-regulation by LPS (Table V) likely modulates LPS-induced
cytoskeletal reorganization. The up-regulation of nonmuscle myosin heavy
chain and a putative phosphorylated form of myosin heavy chain (putative
protein kinase C substrate by prediction rules) in the LPS-exposed PMN
(Table IV) is of uncertain significance; myosin has been implicated in
multiple functions in the PMN, including locomotion, fluid pinocytosis,
and phagocytosis (58). Of interest, however, S100A4 (down-regulated,
Table II) has been reported to regulate cytoskeletal dynamics by
inhibiting protein kinase C-mediated phosphorylation of nonmuscle myosin
heavy chain (59).

Table VIII

LPS-regulated proteins for which a probe was present on the
Affymetrix chip

A comparison of corresponding protein and mRNA transcript changes
following LPS exposure is shown.

Protein                                     Protein change  mRNA change
                                            (-fold)         (-fold)

Up-regulated
Proteasome β chain                          1.6             1.9 ↑
Leukocyte elastase inhibitor                2.4             4.6 ↑
Rab-GDI β                                   1.24            NC
Grancalcin                                  New             NC
Transaldolase                               2.6             NC
Moesin                                      2.1             NC
Nonmuscle myosin heavy chain                New             NC
Glutathione S-transferase P                 1.54            Absent
Pre-B cell enhancing factor                 1.29            Absent
Isocitrate dehydrogenase                    2.3             Absent
PO4-stathmin                                2.1             Absent (stathmin)
Protein phosphatase 1, β catalytic subunit  2               Absent
Annexin III                                 3.1             3.1 ↓

Down-regulated
Adenylyl cyclase-associated protein 1       1.9             2.1 ↓
Rho-GAP 1                                   1.6             2.7 ↓
Ficolin 1                                   1.4             1.7 ↓
Adenosylhomocysteinase                      2.6             Absent
Calponin H2                                 2               NC

NC, no measurable change.

LPS induction of stathmin phosphorylation (Table IV and Fig. 2) may
represent another mechanism by which the cytoskeleton is remodeled.
Stathmin is a phosphoprotein reportedly involved in both signal
transduction and in regulation of the microtubule filament network;
furthermore, phosphorylation of stathmin has been reported to modulate
its tubulin-binding avidity (60). Inferences can be made about both the
phosphorylation site on PO4-stathmin and the responsible kinase induced
by LPS. Four phosphorylation sites in stathmin have been well described:
Ser16, Ser25, Ser38, and Ser63 (32, 33). Ser16 has been reported as a
substrate for Ca2+/calmodulin (CaM)-dependent kinases (32), and Ser25 as
primarily a substrate for p38δ and ERK (33), with p34cdc2 also active
but bearing a 5-fold preference for Ser38 (34). As stated above, the
phosphopeptide identified in PO4-stathmin, extending from residues 15 to
27 (1468.7 Da), is consistent with phosphorylation of either Ser16 or
Ser25 (Fig. 2). Although both p38δ and p38α MAPK isoforms are expressed
in the human PMN, LPS has been shown to selectively activate the p38α
isoform in human PMNs (9). The p38α isoform, however, has been shown to
be relatively inactive at Ser25; in fact, p38δ is ~100-fold more active
at Ser25, and selective p38α inhibitors do not inhibit the
stress-activated phosphorylation of stathmin in 293 cells (33). Further
support for the lack of involvement of p38 signaling in phosphorylation
of stathmin in our system is the apparent lack of effect of SB203580 (a
selective p38α and p38β inhibitor) on LPS-induced expression of
PO4-stathmin (Table IV). Because p34cdc2 is relatively inactive at Ser25
(34), we conclude that the phosphorylation site is likely to be Ser16, a
reported substrate of CaM-dependent kinase. Although CaM kinases have
previously been implicated in gene activation in LPS-exposed
myelomonocytic HD11 cells (61), stathmin signaling has not, to our
knowledge, been previously reported in either PMNs or lipopolysaccharide
signal transduction.

Cytoskeletal reorganization, a well-described regulator of granule
release (62), may underlie LPS-induced priming for PMN granule release,
but several LPS-regulated proteins may provide more specific clues. LPS
exposure led to increased levels of grancalcin, a calcium-binding
protein previously detected in PMNs and shown to translocate to granules
and plasma membrane in the presence of physiologic concentrations of
calcium (63). Similarly, annexin III, a calcium-binding protein highly
expressed in PMN granule membranes and implicated in calcium-mediated
secretion (64) and in granule fusion (65), was also found to be
up-regulated. Exocytosis of granule contents may also be facilitated by
LPS up-regulation of Rab-GDP dissociation inhibitor (Table III), which
has been proposed to recycle Rab after vesicle fusion by extracting it
from the membrane and loading it onto newly formed transport
intermediates (66).

Parallel use of DNA microarrays and proteomics affords a powerful
strategy for comparison of corresponding mRNA transcripts and proteins,
thereby affording new insight into the mechanisms by which the cell
regulates its signaling responses to the external environment. Of
interest, a poor correlation was found between corresponding transcripts
and proteins (Table VIII), as reported in other systems (17, 18). The
finding in some cases of unchanged transcript abundance in the face of
regulated protein levels indicates post-transcriptional modulation
following LPS exposure. The finding of undetected transcripts in the
face of regulated levels of the corresponding proteins may indicate
previous transcription of these genes in an earlier state of the myeloid
maturation of the PMN, producing stable protein species that have
undergone post-translational alteration following LPS exposure. The use
of SB203580, a p38 inhibitor, adds further insights into the mechanisms
of LPS regulation. At the level of mRNA expression, SB203580 inhibited
23% of LPS-stimulated genes by ≥40% and 11% of genes by ≥60%; therefore,
p38 plays a specific role in gene regulation in the PMN. In particular,
proteasome β chain was up-regulated at both the mRNA transcript and
protein level (Table VIII), with no notable effect of SB203580 on
expression at either level, consistent with a non-p38-mediated pathway
of primary transcriptional up-regulation induced by LPS. Similarly,
CAP1, Rho-GAP1, and ficolin 1 were down-regulated at both the mRNA
transcript and protein level (Table VIII), with no notable effect of
SB203580, consistent with a non-p38-mediated pathway of primary
transcriptional down-regulation. Interestingly, annexin III was
down-regulated at the transcript level and up-regulated at the protein
level, with an inhibitory effect of SB203580 seen only at the protein
level (Table VII), consistent with a p38-mediated post-transcriptional
up-regulation induced by LPS.

Limitations of the present study should be noted. Gene expression
analysis by cDNA microarrays does not distinguish between
transcriptional regulation and mRNA stabilization; similarly,
two-dimensional PAGE proteomics by itself does not distinguish among
transcriptional, translational, or post-translational regulation of
protein abundance. Transcript detection by microarray technology is
limited to the probes included; protein identification by
two-dimensional PAGE proteomics is limited to well-resolved regions of
the gel, may perform less well with hydrophobic and high molecular
weight proteins, and tends to select for more abundant protein species
(30). Harvesting of the LPS-incubated PMNs at 4 h may have prevented
detection of earlier, transient changes and may have thereby introduced
artifactual transcript-protein discordance. Furthermore, the post-LPS
incubation, pre-two-dimensional PAGE cell washes would be expected to
remove secreted proteins from further analysis, with uncertain effects
on detected protein abundance depending on such factors as the degree of
de novo synthesis and extent of degranulation/exocytosis. Because
protein binding of Coomassie Blue has a limited dynamic range and is
typically not linear throughout the range of detection, image analysis
of Coomassie Blue-stained protein spots should be considered
semi-quantitative. For some protein spots, the apparent magnitude of
regulation by LPS may have been blunted by the spot approaching staining
saturation in the control gel. By limiting our analysis to those protein
spots common to all twelve pH 3.0-10.0 two-dimensional gels, we likely
excluded some LPS-regulated proteins that happened to be either poorly
resolved on a subset of the gels or unmatched by the image analysis
software. By further limiting the analysis to those matched spots on the
pH 3.0-10.0 gels for which a two-tailed t test demonstrated p < 0.05,
the list of regulated proteins was likely also limited by statistical
power. In addition to those regulated proteins listed in Table III,
three others were up-regulated and three down-regulated with p < 0.09
(data not shown).

Limiting our reported results to those changes that met statistical
significance among the donors carries further important implications. We
have encountered a two order of magnitude range of response in
unselected donor LPS-induced PMN functions, such as TNF-α and superoxide
anion release (data not shown). The sources of this physiologic
heterogeneity remain uncertain but may possibly include such factors as
natural mutations of the LPS receptor component, TLR4 (67). By selecting
for LPS effects common to all donors, we may not have characterized the
range of genomic and proteomic heterogeneity present in the population
and thereby may have focused on only a narrow portion of a broader
biological response to LPS. We contend that this reductionist approach
is valid because it would be expected to enrich for biologically
integral responses of the PMN to LPS. Nevertheless, correlation of
genomic and proteomic profiles with functional phenotypes of the PMN may
bear important diagnostic and therapeutic implications and will be
pursued in future studies.

Widespread regulation of numerous noncytokine/chemokine genes and
proteins in the LPS-stimulated human PMN is a novel finding. These data
indicate that, despite a narrow scope of gene expression in the
nonstimulated state, the terminally differentiated, short-lived PMN
likely plays a role in the innate immune response that is far more
sophisticated and dynamic than the simple release of preformed
inflammatory mediators. Although gene expression appears to be an
important mechanism by which PMNs respond acutely to infection, mRNA
transcript/protein concordance is limited, and post-transcriptional (and
post-translational) modifications also play an important role. The
alteration of multiple transcriptional regulators, G-protein regulators,
PO4-stathmin, and protein phosphatase 1 indicates that one of the
responses to LPS exposure is to modify subsequent signaling events by
bacterial components or by other cytokines and chemokines. Finally, the
finding that p38 MAPK mediates LPS regulation of a limited subset of
transcripts and proteins underlines the continuing need to define signal
transduction cascades in the neutrophil.

High-throughput technologies, such as proteomic screening and DNA
micro-arrays, produce vast amounts of data requiring comprehensive
analytical methods to decipher the [...] results. One approach would be
to manually search the biomedical literature [...] an arduous task. We
developed an automated literature-mining tool, MedGene, [...]
comprehensively summarizes and estimates the relative strengths of [...]
relationships in Medline. Using MedGene, we analyzed [...] comparing
breast cancer and normal breast tissue in the context [...] correlation
between the strength of the literature association and the [...]

Introduction

At its current pace, the accumulation of biomedical literature outpaces
the ability of most researchers and clinicians to stay abreast of their
own immediate fields, let alone cover a broader range of topics. For
example, to follow a single disease, e.g., breast cancer, a researcher
would have had to scan 130 different journals and read 27 papers per day
in 1999. This problem is accentuated with high-throughput technologies
such as DNA micro-arrays and proteomics, which require the analysis of
large datasets involving thousands of genes, many of which are
unfamiliar to a particular researcher. In any microarray experiment,
thousands of genes may demonstrate statistically significant expression
changes, but only a fraction of these may be relevant to the study. The
ability to interpret these datasets would be enhanced if they could be
compared to a comprehensive summary of what is known about all genes.
Thus, there is a need to summarize existing knowledge in a format that
allows for the rapid analysis of associations between genes and diseases
or other specific biological concepts.

One solution to this problem is to compile structured digital resources,
such as the Breast Cancer Gene Database and the Tumor Gene Database.
However, as these resources are hand-curated, the labor-intensive review
process becomes a rate-limiting step in the growth of the database. As a
result, these [...] in a systematic fashion.

* To whom correspondence should be addressed: jlabaer@hms.harvard.edu.
American Chemical Society

An alternative approach is automated text mining: a method which
involves automated information extraction [...] context. This approach
has been used successfully in several instances for biological
applications. In most cases, it has been applied to extract information
about the relationships or interactions that proteins or genes have with
one another in the literature or by functional annotation. Thus far, few
publications have applied text-mining to examine the global
relationships between genes and diseases. Perez-Iratxeta et al.
automatically examined the GO (Gene Ontology) annotation of genes and
their predicted chromosomal locations in order to identify genes linked
to inherited disorders.

[...] global understanding of disease development, it would be valuable
to incorporate information regarding all possible gene-disease
relationships, including biochemical, [...] epidemiological, as well as
genetic. This information would enable comprehensive comparisons between
large experimental datasets and existing knowledge in the literature.
This would accomplish two things. First, it would serve to validate
experiments by demonstrating [...]. Second, it would rapidly highlight
which genes are corroborated by the literature and which genes are novel
in a given context. We have utilized a computational approach to
literature mining to produce a comprehensive set of gene-disease
relationships. In addition, we have developed a novel approach to assess
the strength of each association based on the frequency of citation and
co-citation. We applied this tool to help interpret the data from a
large micro-array gene expression experiment comparing normal and
cancerous breast tissue.

Journal of Proteome Research [volume and pages illegible]

Methods 

MedGene Database. MedGene is a relational database, storing disease and
gene information from NCBI, text mining results, statistical scores, and
hyperlinks to the primary literature. MedGene has a web-based
user-interface for users to query the database
(http://hipseq.med.harvard.edu/MedGene/).

Text Mining Algorithms. MeSH files were downloaded from the MeSH web
site at NLM (National Library of Medicine)
(http://www.nlm.nih.gov/mesh/meshhome.html) and human disease categories
were selected. LocusLink files were downloaded from the LocusLink web
site at NCBI (http://www.ncbi.nlm.nih.gov/LocusLink/). Official/preferred
gene symbol, official/preferred gene name, and gene alternative symbols
and names, all relevant annotations and URLs for each LocusLink record,
were collected. Gene search terms were used for literature searching and
included all qualified gene names, gene symbols, and gene family terms.
Primary gene keys, predominantly qualified gene family terms and gene
official/preferred symbols, were used to index Medline records. If the
official/preferred gene symbols did not meet the standards to be an
index, then qualified gene official/preferred names were used. A local
copy of Medline records (up to July, 2002) was pre-selected.

A JAVA module examined the MeSH terms and then indexed each Medline
record with the appropriate disease terms. A separate JAVA module was
used to examine the titles and abstracts for gene search terms and then
to index the gene-related Medline records with the relevant primary gene
key(s).
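
The gene-indexing step described above can be illustrated with a minimal sketch. This is not the authors' JAVA module: it is written in Python for brevity, the synonym table and the set of case-sensitive terms are hypothetical, and the word-boundary rule is an assumption of this sketch.

```python
import re
from typing import Dict, List, Pattern, Set

# Hypothetical synonym table: primary gene key -> search terms.
GENE_TERMS: Dict[str, List[str]] = {
    "TP53": ["TP53", "P53"],
    "BAG1": ["BAG1", "BAG-1"],
    "WAS": ["WAS"],
}
# Hypothetical list of terms that are only matched case-sensitively,
# because they collide with ordinary English words.
CASE_SENSITIVE_TERMS: Set[str] = {"WAS"}


def compile_patterns(
    gene_terms: Dict[str, List[str]], case_sensitive: Set[str]
) -> Dict[str, List[Pattern[str]]]:
    """Build one regular expression per search term, grouped by gene key."""
    patterns: Dict[str, List[Pattern[str]]] = {}
    for key, terms in gene_terms.items():
        compiled = []
        for term in terms:
            flags = 0 if term in case_sensitive else re.IGNORECASE
            # Require that the term is not embedded in a longer token.
            regex = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
            compiled.append(re.compile(regex, flags))
        patterns[key] = compiled
    return patterns


def index_record(text: str, patterns: Dict[str, List[Pattern[str]]]) -> Set[str]:
    """Return the primary gene keys whose synonyms occur in a title/abstract."""
    return {
        key
        for key, regexes in patterns.items()
        if any(regex.search(text) for regex in regexes)
    }


if __name__ == "__main__":
    pats = compile_patterns(GENE_TERMS, CASE_SENSITIVE_TERMS)
    print(index_record("p53-dependent apoptosis was observed.", pats))
```

Keeping the case-sensitive terms in a separate set means the default behaviour stays case-insensitive, with exceptions added one term at a time.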

Statistical Methods. For every gene and disease pair, we counted records
that were indexed for both gene and disease (double positive hits), for
disease only (disease single hits), for gene only (gene single hits),
and for neither gene nor disease (double negative hits) to generate a
2 x 2 contingency table. On the basis of the contingency table
framework, we applied different statistical methods to estimate the
strength of gene-disease relationships and evaluated the results. These
methods included chi-square analysis, Fisher's exact probabilities,
relative risk of gene, and relative risk of disease
(http://hipseq.med.harvard.edu/MedGene/). In addition, we computed the
"product of frequency", which is the product of the proportion of
disease/gene double hits to disease single hits and the proportion of
disease/gene double hits to gene single hits. To obtain a normal
distribution, we transformed all the statistical scores using the
natural logarithm. We selected the log of the product of frequency (LPF)
to validate MedGene and to use for the analysis with the micro-array
data. Spearman rank-correlation coefficients were used to assess the
linear relationship between LPF and micro-array fold change in
expression level.
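
As an illustration of the scores described above, the sketch below computes the log of the product of frequency from the four counts of a 2 x 2 table and a Spearman rank correlation. It follows the wording of the text literally (double hits divided by single hits); whether the published implementation used exactly these denominators is an assumption, as are the class and function names and the convention of returning None for undefined scores.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class PairCounts:
    """Medline record counts for one gene-disease pair (2 x 2 table)."""
    both: int          # double positive hits
    disease_only: int  # disease single hits
    gene_only: int     # gene single hits
    neither: int       # double negative hits (used by the other statistics)


def log_product_of_frequency(counts: PairCounts) -> Optional[float]:
    """Natural log of (both/disease_only) * (both/gene_only).

    Returns None when the score is undefined (a zero count would give a
    zero product or a zero denominator).
    """
    if min(counts.both, counts.disease_only, counts.gene_only) <= 0:
        return None
    product = (counts.both / counts.disease_only) * (counts.both / counts.gene_only)
    return math.log(product)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    example = PairCounts(both=50, disease_only=1000, gene_only=200, neither=100000)
    print(log_product_of_frequency(example))
```

With these hypothetical counts the product of frequency is 0.05 x 0.25 = 0.0125, so the LPF is negative; pairs that are co-cited more often relative to their single hits score closer to zero or above.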

Global Analysis. Diseases with at least 50 related genes were selected
for clustering analysis, and the LPF scores were normalized with total
score for each disease. Hierarchical clustering was done with the
"Cluster" software and the clustering result was visualized using
"TreeViewer" (http://rana.lbl.gov/EisenSoftware.htm).

Micro-Arrays. Eighty-nine breast cancer samples (79% ER-positive) and 7
normal breast tissue samples were selected from the Harvard Breast SPORE
frozen tissue repository and were representative of the spectrum of
histological types, grades, and hormone receptor immuno-phenotypes of
breast cancer. Biotinylated cRNA, generated from the total RNA extracted
from the bulk tumor, was hybridized to Affymetrix U95A oligo-nucleotide
micro-arrays. These micro-arrays consist of 12 400 probes, which
represent approximately 3000 genes. Raw expression values were obtained
using GENECHIP software from Affymetrix, and then further analyzed using
the DNA-Chip Analyzer (dChip) custom software.

Results 

Automated Indexing of Medline Records by Disease and Gene. To study the
gene-disease associations in the literature, we first compiled complete
lists for human diseases and human genes. To index all Medline records
that were relevant to human diseases, the Medical Subject Heading (MeSH)
index of Medline records was utilized. MeSH is a controlled medical
vocabulary from the National Library of Medicine and consists of a set
of terms or subject headings that are arranged in both an alphabetic and
a hierarchical structure. Medline records are reviewed manually and MeSH
terms are added to each with software assistance. Twenty-three human
disease category headings along with all of their child terms (see the
Supporting Information, Supplemental Table 1, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_table1.html) were
selected from the 2002 MeSH index, creating a list of 4033 human
diseases.

No Index comparable to the MeSH Index exists for genes 
and thus. It was necessary to apply a string search algorithm 
for gene names or symbols found In Medline text. A complete 
list uf genes, gene names, gene symbols, and frequently used 

H,T.' Tl ^ LocusUnk databU at 

NCBI," " which contains 53 259 Independent records keyed 
by an ofHclal gene symbol or name (|une 18*. 2002). For the 
purposes of this study, no dIstlncUon was made between ttenes 

t'!iu Z^^'"',^'^'''^ •»« 'he same nar^e for 

both, differentiating the two only by the use of Italics, If at all 
For the intended use of this study, this tock of distinction Is 
unlikely to have a large effect and may in fact be benelldal. 
Initial attempts to search the literature using these lists
produced both false positives and false negatives
(Table 1). False positives primarily arose when the searched
term had other meanings, whereas false negatives arose from
syntax discrepancies, necessitating the development of filters
to reduce these errors. The syntax issues were readily handled
by including alternate syntax forms in the search terms. The
false positive cases, caused by duplicative and unrelated
meanings for the terms, were more difficult to manage. Where
possible, case-sensitive string mapping reduced inappropriate
citations. In many cases, however, this was not sufficient and
the terms had to be eliminated entirely, thereby reducing the
false positive rate but unavoidably under-representing some
genes.

For the purposes of data tracking, a primary gene key was
assigned so that all synonyms correspond to each
gene. Medline records were indexed with a primary gene key
when any synonym for that key was found in the title or
abstract. Case-insensitive string mapping was used for all
searches except as noted above. No additional weight was
added for multiple occurrences of a term or the co-occurrence
of multiple synonyms for the same gene key.

Table 1

gene symbol/name is not unique           false positive   MAG - myelin associated glycoprotein; MAG - malignancy-associated protein   eliminate this term
gene symbol is unrelated abbreviation    false positive   PA - pallid homologue (mouse) (also abbrev. for Pennsylvania)               eliminate this term
gene symbol/name has language meaning    false positive   WAS - Wiskott-Aldrich Syndrome (also the word 'was')                        case-sensitive string search
nonstandard syntax                       false negative   BAG-1 instead of BAG1                                                       add dash term
unofficial gene name/symbol              false negative   p53 instead of TP53                                                         add all gene nicknames
nonspecified gene name                   false negative   estrogen receptor instead of Estrogen receptor 1                            add family stem term

bis. real «talonship..h««»u™l«,.p^^S^fl^:^^^

error. In gonerrterrar rate. maxImlMd «nsith-ity.Vven al «heMp««crf^^«^tfSr ^''"'^^

Medline records were searched with all qualified gene
identifiers, such as the official/preferred gene symbol, the
official/preferred gene name, all gene nicknames and all syntax
variants. In situations where there are several members of a
gene family or splice variants, some authors prefer to use a
shortened gene family name, e.g., estrogen receptor instead of
estrogen receptor 1 (ESR1), creating a source of false negatives.
For this reason, gene family stem terms were created for all
genes that have an alpha or numerical suffix (e.g., IL2RA, TGFβ,
ESR1, etc.) and then used to search the literature. The family
stem terms were handled separately from the specific gene
names so that it would be clear when linkages were made to
the gene family versus a specific member in that family.
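The synonym indexing and family stem terms described above can be sketched roughly as follows. This is a minimal illustration, not the MedGene implementation: the suffix pattern, the function names family_stem and index_record, and the toy synonym table are all assumptions made for the example.

```python
import re

# Hypothetical suffix pattern: a trailing number (optionally followed by one
# letter) or a spelled-out Greek letter, with any separating space or dash.
_SUFFIX = re.compile(r"[\s-]*(?:\d+[A-Za-z]?|alpha|beta|gamma)$", re.IGNORECASE)


def family_stem(term):
    """Return the family stem of a gene term, or None if it has no suffix."""
    stem = _SUFFIX.sub("", term).strip()
    return stem if stem and stem != term else None


def index_record(text, synonyms_by_key, case_sensitive_keys=frozenset()):
    """Return the primary gene keys that have at least one synonym in text.

    Matching is case-insensitive unless the key is listed in
    case_sensitive_keys. Each key is counted once, so repeated mentions
    and co-occurring synonyms add no extra weight.
    """
    hits = set()
    for key, synonyms in synonyms_by_key.items():
        flags = 0 if key in case_sensitive_keys else re.IGNORECASE
        for synonym in synonyms:
            # Require non-alphanumeric boundaries so "ESR1" does not match "ESR10".
            pattern = r"(?<![A-Za-z0-9])" + re.escape(synonym) + r"(?![A-Za-z0-9])"
            if re.search(pattern, text, flags):
                hits.add(key)
                break
    return hits


# Toy synonym table (illustrative only); the stem is kept under its own key
# so that family-level hits stay separate from hits on a specific member.
synonyms = {
    "ESR1": ["ESR1", "estrogen receptor 1"],
    "ESR1 (family stem)": [family_stem("estrogen receptor 1")],
    "WAS": ["WAS"],
}
abstract = "Expression of the estrogen receptor was lost in these tumors."
print(index_record(abstract, synonyms, case_sensitive_keys={"WAS"}))
```

With the case-sensitive filter on WAS, only the family stem key is reported for this sentence; without it, the ordinary word "was" would also be indexed as the gene.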

To improve performance and accuracy, some pre-selection
was applied to the records that were scanned. First, review
articles were eliminated to avoid redundant treatment of
citations. Second, non-English journals were removed because
the natural language filters were only relevant to English
publications. Finally, journals unlikely to contain primary data
about gene-disease relationships were also removed (e.g., Int
J Health Educ, Bedside Nurse, and J Health Econ). Together,
2oS b ' wi^*"*^ the 12 198 221 Medline publications fluly 

Ranking the Relative Strengths of Gene-Disease Associations.
In total, there were 618 708 gene-disease co-citations,
in which 16% (8297) of all studied genes had been associated
to a disease and 96% (3875) of all diseases had been associated
to at least one gene. To rank the relative strengths of gene-disease
relationships, we tested several different statistical
methods and examined the results. With the exception of the
relative risk estimates, the methods provided similar results
with respect to the rank order of the gene-disease association
strengths. However, after comparing the results to other
databases and after consulting disease experts, the log of the
product of frequency (LPF) was selected for further analysis
because it gave the best results overall.
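As a rough illustration of ranking by LPF, the sketch below scores each gene from its citation counts. The formula is not restated in this section, so the definition used here is an assumption: the product of (co-citations / disease citations) and (co-citations / gene citations), with a base-10 logarithm. The function names and the counts are hypothetical.

```python
import math


def lpf(n_both, n_disease, n_gene):
    """Log of the product of frequency under the assumed definition.

    n_both    -- records citing both the gene and the disease
    n_disease -- records citing the disease
    n_gene    -- records citing the gene
    Returns None when there is no co-citation (the log is undefined).
    """
    if n_both <= 0:
        return None
    return math.log10((n_both / n_disease) * (n_both / n_gene))


def rank_genes(n_disease, counts):
    """Rank genes for one disease, strongest association first.

    counts maps gene -> (n_both, n_gene); genes without co-citations are dropped.
    """
    scored = []
    for gene, (n_both, n_gene) in counts.items():
        score = lpf(n_both, n_disease, n_gene)
        if score is not None:
            scored.append((gene, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical counts for a disease cited in 1000 records.
toy_counts = {"GENE_A": (50, 200), "GENE_B": (5, 10), "GENE_C": (0, 500)}
for gene, score in rank_genes(1000, toy_counts):
    print(gene, round(score, 3))
```

Under this assumed definition a gene scores highly only when the co-citations are frequent relative to both the disease literature and the gene's own literature, which is one way to use both co-citation and single-citation frequencies.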

Validation of MedGene. In developing this tool, it was
important to minimize the number of missed genes (false
negatives) and miscalled genes (false positives). However, in
situations when these goals were in conflict, inclusiveness was
prioritized. To determine the false negative rate in MedGene,
breast cancer was used as a test case because it was associated
with more genes than any other human disease and because
there were several public databases that link genes to breast
cancer. We compared the list of breast cancer-related genes
from MedGene to these databases, illustrated in Figure 1.
Among the 285 distinct breast cancer-related genes that were
supported by at least one literature citation in these hand-curated
databases, 26 were absent from MedGene, suggesting
a false negative rate of approximately 9%. To determine why
these were missed, all literature references for these genes (80
papers) were reviewed manually (see the Supporting Information,
Supplemental Table 2, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 2.html). Among
these papers, most false negatives were caused by nonstandard
gene terms or gene terms eliminated by our specificity filters.
Few genes were missed because they were only mentioned in
review papers (0.4%) or they appeared only in the body of the
manuscript but not the abstract or title (1.1%). Of note,
MedGene identified approximately 2000 additional breast
cancer-related genes not listed in any other database.

Figure 1. Estimation of the false negative rate by comparison
with hand-curated databases. The breast cancer-related genes
identified by MedGene were compared with those listed in
other databases including the Tumor Gene Database
(TGD), the Breast Cancer Gene Database (BCG), GeneCards
(GC) and Swissprot. Genes were considered false negatives
if they were represented in at least one of these other databases
and not in MedGene and their link to breast cancer was supported
by a literature citation; these
were verified by manual review to confirm their validity. The
number of genes in each database or shared by more than one
database is indicated. The false negative rate was calculated as
genes missed at MedGene (26)/total number of nonoverlapping
genes in other databases (285).
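The false negative estimate above is a set comparison followed by a ratio. A minimal sketch, with toy gene sets standing in for the hand-curated databases (the function name and the sets are illustrative assumptions; only the counts 26 and 285 come from the text):

```python
def false_negative_rate(reference_sets, found):
    """Fraction of reference genes that are absent from the found set.

    reference_sets -- iterable of gene sets from curated databases
    found          -- gene set reported by the text-mining tool
    Returns (rate, missed_genes).
    """
    reference = set().union(*reference_sets)  # non-overlapping union
    missed = reference - found
    return len(missed) / len(reference), missed


# Toy example: four distinct reference genes, one of them missed.
curated = [{"BRCA1", "BRCA2", "TP53"}, {"TP53", "ERBB2"}]
mined = {"BRCA1", "TP53", "ERBB2", "ESR1"}
rate, missed = false_negative_rate(curated, mined)
print(rate, missed)

# The figures reported in the text: 26 missed out of 285 distinct genes.
print(f"{26 / 285:.1%}")
```

The last line prints 9.1%, which matches the "approximately 9%" quoted in the paragraph.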

To assess the false positive error rate, two complementary
approaches were used: a detailed analysis of one disease and
a global examination of 1000 diseases. The detailed approach
examined the false positive error rate and its sources, whereas
the global approach tested whether the overall results made
biomedical sense.

Using the LPF, 1167 genes related to prostate cancer were
assembled in rank order. We then retrieved approximately 300
Medline records each for the highest ranked 100 and the lowest
ranked 200 genes and manually reviewed the titles and
abstracts to determine the verity of the association. Nearly 80%
of the highest ranked 100 genes fell into one of the five
categories that reflect meaningful gene-disease relationships
(see the Supporting Information, Supplemental Table 3, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Among the lowest ranked 200 genes, approximately
70% reflected true relationships. Of the 600 records
reviewed, there were only two in which the association between
the gene and the disease was described as negative. Both were
genes with very low scores. In both cases, the authors did not
argue the absence of any relationship, but rather that a
particular feature of the gene or protein was not shown to be
related to human prostate cancer.

The coincidence of some gene symbols with medical abbreviations,
chemical abbreviations and biological abbreviations
resulted in most of the false positives (see the Supporting
Information, Supplemental Table 4, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 4.html),
emphasizing the importance of the filters that were added in the
search algorithm (Table 1). Without the filters, the false positive
rate more than doubled, and the false negative rate rose
dramatically (data not shown). For example, among the papers
about breast cancer, there were only 12 Medline records that
referred to ESR1 and 10 to ESR2, whereas almost 2000 papers
mentioned estrogen receptor without specifying ESR1 or ESR2;
this latter group was detected by the family stem term filter.

To further validate these results, a global analysis of the
gene-disease relationships described by MedGene was performed.
For this experiment, it was reasoned that the more closely
related the diseases are to one another, the more they will be
related to the same gene sets. Thus, if the relationships defined
by MedGene accurately reflected the literature, then an
unsupervised hierarchical clustering of the gene data should group
diseases in a manner consistent with common medical thinking.
Conversely, if the clustered diseases do not make sense
biologically or medically, it may reflect excessive false positives,
false negatives, or inappropriate scoring of the data.

To execute this experiment, the gene sets and the corresponding
LPF values for 1000 randomly selected diseases (each
with at least 50 gene relationships) were used as a dataset for
clustering the diseases. A review of the results showed that the
resulting disease clusters were indeed logical based upon
common medical knowledge (see the Supporting Information,
Supplemental Figure 1, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Figure 1.html).
For example, in one such cluster shown in Figure 2, diabetes
and its complications grouped together and were also closely
linked to diseases associated with starvation states.
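The clustering step can be pictured with a small, self-contained sketch. The text does not state the distance measure or linkage rule that was used, so the cosine distance and average linkage below are assumptions, and the disease profiles are toy positive scores rather than real LPF values.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse gene -> score dicts."""
    dot = sum(value * b[gene] for gene, value in a.items() if gene in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / (norm_a * norm_b)


def average_linkage(profiles):
    """Agglomerative clustering; returns the merge tree as nested tuples."""
    clusters = {name: [name] for name in profiles}  # label -> member diseases
    trees = {name: name for name in profiles}       # label -> merge tree
    while len(clusters) > 1:
        best = None
        labels = sorted(clusters)
        for i, x in enumerate(labels):
            for y in labels[i + 1:]:
                # Average pairwise distance between members of the two clusters.
                total = sum(cosine_distance(profiles[p], profiles[q])
                            for p in clusters[x] for q in clusters[y])
                dist = total / (len(clusters[x]) * len(clusters[y]))
                if best is None or dist < best[0]:
                    best = (dist, x, y)
        _, x, y = best
        merged = x + "+" + y
        clusters[merged] = clusters.pop(x) + clusters.pop(y)
        trees[merged] = (trees.pop(x), trees.pop(y))
    return next(iter(trees.values()))


# Toy disease profiles (hypothetical scores, for illustration only).
toy_profiles = {
    "diabetes": {"INS": 3, "LEP": 2, "PPARG": 1},
    "diabetic nephropathy": {"INS": 2, "LEP": 1, "ACE": 1},
    "breast neoplasms": {"BRCA1": 3, "ERBB2": 2, "TP53": 1},
    "ovarian neoplasms": {"BRCA1": 2, "TP53": 2},
}
print(average_linkage(toy_profiles))
```

In this toy input the two diabetes-related entries merge first and the two neoplasms merge next, which is the kind of medically sensible grouping the experiment was checking for.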

The number of genes associated with a given disease can
be estimated by adjusting the MedGene number up by the false
negative rate (~9%) and down by the false positive rate (~26%
on average). Using this, the average disease has 103.7 ± 45.3
(mean ± s.d.) genes associated with it, although the range is
quite broad, with 2359 genes related to breast cancer, 2122
genes related to lung cancer and no genes related to a number
of diseases.
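The adjustment described above can be written as a one-line calculation. The text does not give the exact formula, so the reading used here is an assumption: the reported count is first reduced to its true positives and then scaled up to account for the genes that were missed. The function name and the example count are hypothetical.

```python
def adjusted_gene_count(reported, fn_rate=0.09, fp_rate=0.26):
    """Estimate the true number of disease genes from a reported count.

    Assumed reading of the adjustment:
      true positives found = reported * (1 - fp_rate)
      estimated true total = true positives found / (1 - fn_rate)
    """
    return reported * (1.0 - fp_rate) / (1.0 - fn_rate)


# A hypothetical disease with 100 reported gene associations.
print(round(adjusted_gene_count(100), 1))
```

Because the assumed false positive rate is larger than the false negative rate, the adjusted estimate comes out below the reported count.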

Applying MedGene to the Analysis of Large Datasets. Access
to a comprehensive summary of the genes linked to human
diseases provided an opportunity to analyze data obtained from
a high-throughput experiment. We compared the MedGene
breast cancer gene list to a gene expression data set generated
from a micro-array analysis comparing breast cancer and
normal breast tissue samples. Micro-array analysis identified
2286 genes that had greater than a Nfold difference in mean
expression level between breast cancer samples and normal
breast samples. Using MedGene, we sorted the 2286 genes into
four classes: 555 genes directly linked to breast cancer in the
literature by gene term search (first-degree association by gene
name); 328 genes directly linked by family term search
(first-degree association by family term); 1021 genes linked to breast
cancer only through other breast cancer genes (second-degree
association); and 505 genes not previously associated with
breast cancer. (See the Supporting Information, Supplemental
Figure 2, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_Figure 2.html.)
Among the 505 previously unrelated genes, 467 were either newly
identified genes or genes that had not previously been associated
with any disease. Among the remaining 38 genes, 0 had been
related to other cancers, specifically esophageal, colon, uterine,
skin, and cervix.

To determine whether the genes highlighted by the micro-array
analysis were more likely to have been previously linked
to breast cancer in the literature, we created a two-dimensional
plot of the fold change of expression level between breast
cancer and normal tissue versus the literature score (LPF)
(Figure 3A). There was a broad spread of expression changes
among the genes directly linked to breast cancer, ranging from
less than Ufold change (68%) to over 40-fold (0.3%). Notably,
the majority of genes with greater than 10-fold expression
changes were linked to breast cancer by first-degree association.

Among all 754 genes directly linked to breast cancer in the
literature, there was no correlation between LPF and micro-array
fold change (r = 0.018, p-value = 0.62). However, when
we stratified the analysis based on the magnitude of the fold
change, we observed an increasing trend in correlation (Figure
3B), suggesting that genes with a more substantial change in
expression level were more likely to have a stronger association
in the literature. For genes that had 10-fold change or more in
expression level, the correlation increased to 0.41 (p-value
0.05).
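The stratified analysis above amounts to recomputing a correlation on the subset of genes whose fold change exceeds each threshold. A minimal sketch with a plain Pearson coefficient follows; the text does not say which correlation statistic was used, so Pearson is an assumption, and the function names and data points are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient; None if either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def stratified_correlation(genes, thresholds):
    """Correlate |fold change| with literature score above each threshold.

    genes      -- list of (fold_change, literature_score) pairs
    thresholds -- minimum absolute fold changes defining each stratum
    Returns {threshold: r}, with None where fewer than 3 genes qualify.
    """
    result = {}
    for threshold in thresholds:
        subset = [(abs(fc), score) for fc, score in genes if abs(fc) >= threshold]
        if len(subset) < 3:
            result[threshold] = None
        else:
            result[threshold] = pearson_r([fc for fc, _ in subset],
                                          [score for _, score in subset])
    return result


# Hypothetical (fold change, literature score) pairs.
toy_genes = [(1.2, -3.0), (1.5, -1.0), (2.0, -2.5),
             (12.0, -2.0), (20.0, -1.5), (40.0, -1.0)]
print(stratified_correlation(toy_genes, [1, 10]))
```

Taking the absolute fold change treats strongly down-regulated and strongly up-regulated genes alike, which is one reasonable reading of "magnitude of the fold change".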

When we evaluated the micro-array data separately for ER
positive and ER negative tumors, the trend in correlation
between fold change and literature score was highly dependent
on estrogen receptor status. Interestingly, there was a similar
trend in correlation for ER positive tumors, but no trend in
correlation for ER negative tumors.

Finally, to validate our findings, we computed similar correlations
between the breast cancer expression data and
LPF scores generated by MedGene for hypertension, a
disease unrelated to breast cancer. As expected, we did not
observe an increasing trend in correlation for hypertension.
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Analysis oT Data Using Advanced Uumtun Mining 
Table 2. Top 25 Genes Related to Selected Human Diseases*

breast neoplasms: estrogen receptor, PGR, ERBB2 *, BRCA1, BRCA2,
EGFR, cms, TFF1, PSEN2, TP53, CESS, CEACAM5, ERBB3, cyclin, COXSA,
cathepsin, ERBB4, TRAM, CCND1, EGF, MUC1, insulin-like, ECU, mucin,
FGF3

hypertension: REN, DBF, LEP, AGT, INS, kallikrein, ACE, endothelin,
sime, BDK, DIANPH, SARi, -FUi, C059, ALB, CYP11B2, MAT2B,
angiotensin receptor, AGHU, NPPA, LVM, DBH, NPY, POKiC, neuropeptide

rheumatoid arthritis: RA, TNFRSFJQA, CRP, AS, ESR1, HLA-DRB1, DRI,
interleukin, TNF, R6, collagen, IL1A, AOi, TNFRSFIZ, IL2, emu,
interleukin 1, matrix metalloproteinase, interferon, am, IL17, m4P3,
SIL

bipolar disorder: ERDA1, SNAP29, PFKL, DRD2, TRH, IMPA2, HTR3A,
DRD3, REM, KCNN3, DRD4, HTR2C, RSLN, DBH, MAOA, COMT, HTR2A, SYNIi,
INPP1, NEDD4L, FRAI3C, transducer or, BAIAP3, ATPIB3, DRD5

atherosclerosis: apolipoprotein, APOE, LDLR, ELN, ARG1, APOB, APOA1,
MSR1, LPL, pom, plasminogen activator inhibitor, PLC, vascular cell
adhesion molecule, AJOHi, VWF, ms, AS^, OlMl, collagen, MCP,
lipoprotein, APOA2, intercellular adhesion molecule, BAB27A



Discussion 

The Human Genome Project heralded a new era in biological
research where the emphasis on understanding specific pathways
has expanded to global studies of genomic organization
and biological systems. High-throughput technologies can
provide novel insight into comprehensive biological function
but also introduce new challenges. The utility of these
technologies is limited by the ability to generate, analyze, and
interpret large gene lists. MedGene, a relational database
derived by mining the information in Medline, was created to
address this need. MedGene users can query for a rank-ordered
list of human gene-disease relationships (Table 2) for one or
more diseases. Each entry is hyperlinked to the original papers
supporting each association and to other relevant databases.

MedGene is an innovative extension of previous text mining
approaches. Perez-Iratxeta et al. used the GO annotation and
their chromosomal locations to predict genes that may contribute
to inherited disorders. MedGene takes a broader view
and includes all diseases and all possible gene-disease
relationships. Furthermore, MedGene utilizes co-citation to indicate a
relationship rather than GO annotation, which is limited to the
subset of genes that have GO annotation. Our approach is
complementary to that taken by Chaussabel and Sher, who
used the frequency of co-cited terms to cluster genes into a
hierarchy of gene-gene relationships.

A unique aspect of this tool is the ability to assess the relative
strengths of gene-disease relationships based on the frequency
of both co-citation and single citation. This presupposes that
most co-citations describe a positive association, often referred
to as publication bias, and is supported by our observations
that negative associations are rare (Supplemental Table 3:
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Of course, relationships established by frequency
of co-citation do not necessarily represent a true biological link;
however, it is strong evidence to support a true relationship.

Another important feature of MedGene is the implementation
of software filters that substantially reduced the error rate.
We estimate that less than 10% of all associations were missed
and at least 70% of even the weakest associations were real.
For this study, all of the filters that we applied were general
ones, e.g., expanding the list of all gene names to address the
different syntax forms used by different journals, eliminating
gene names that correspond to common English words, etc.
The majority of the remaining search term ambiguities were
idiosyncratic and difficult to identify systematically without
causing a significant rise in false negatives. Alternative
approaches, such as the examination of the nearest neighbor
terms, need to be considered to further reduce the false positive
rate.

It is not uncommon to see expression changes in micro-array
experiments as small as 2-fold reported in the literature.
Even when these expression changes are statistically significant,
it is not always clear if they are biologically meaningful. When
comparing expression levels of disease to normal tissue, one
expects an enrichment of known disease-related genes to
appear in the altered expression group. MedGene provided a
unique opportunity to test this notion in the context of existing
knowledge on a novel breast cancer micro-array dataset. For
genes displaying a 5-fold change or less in tumors compared
to normal, there was no evidence of a correlation between
altered gene expression and a known role in the disease. This
reflects the many genes whose role in breast cancer may not
involve large changes in expression in sporadic tumors (e.g.,
BRCA1 and BRCA2) and genes whose modest changes in
expression may be unrelated to the disease. Strikingly, among
genes with a 10-fold change or more in expression level, there
was a strong and significant correlation between expression
level and a published role in the disease, providing the first
global validation of the micro-array approach to identifying
disease-specific genes.

Table 3. Genes with Large Expression Changes in ER- but
Not in ER+ Breast Tumors

gene symbol    fold change (ER+)    fold change (ER-)
KiaHBl         1.0                  610.8
Bm             1.2                  89.4
DXKI           1.2                  69.8
ZICJ           1.9                  59.6
TlRl           1.0                  38.5
KIAAQSSO       2.6                  33.2
CDKNS          1.0                  30.6
EBI2           4.0                  27.9
CZMB           3.8                  21.9
STKI8          4.7                  18.6
CPR49          1.0                  14.6
mow            1.6                  14.4
UDl            -1.0                 13.5
P0LB2          4.2                  13.0
HMG4           4.4                  12.9
BCL2L11        -1.2                 12.3
LRP8           2.9                  12.2
CCNB2          1.0                  11.8
CCNE2          4.0                  11.6
FCB            -4.3                 11.1
fWSL6          2.9                  10.9
HIF5           3.0                  10.2
SERPINH2       4.6                  10.2
YAP1           1.0                  10.0
LPHB           -U                   -10.4
TCEA2          -1.1                 -10.8
TFF1           1.3                  -11.4
coimt          -4.1                 -15.7
POPS           1.1                  -16.2
BPACJ          -4.6                 -22.3
PDZK1          -IJ                  -36.8
VECfC          -2.8                 -51.5
MUC6           -1.4                 -64.9
SERPINA5       -1.0                 -83.1
MEJSI          -1.6                 -85.9
CA12           2.4                  -150.3

Table 3. MedGene identified a set of relatively understudied, yet highly
expressed genes in ER negative, but not ER positive breast tumors. All of
these genes have either never been co-cited with breast cancer or have a
weak association except those marked with an *.

The results derived from MedGene have two implications.
First, a careful hunt for corroborating evidence of a role in
breast cancer should precede any further study of genes with
less than 5-fold expression level changes. Second, any genes
with 10-fold changes or more are likely to be related to breast
cancer and warrant attention. It is likely that this threshold will
change depending on the disease as well as the experiment.

Interestingly, the observed correlation was only found among
ER-positive tumors, not ER-negative. This may reflect a bias
in the literature to study the more prevalent type of tumor in
the population. Furthermore, this emphasizes that caution
must be taken when interpreting experiments that may contain
subpopulations that behave very differently. The MedGene
approach identified a set of relatively understudied, yet highly
expressed genes in ER-negative tumors that are worthy of
further examination (Table 3).

In conclusion, we have developed an automated method of
summarizing and organizing the vast biomedical literature. To
our knowledge, the resulting database is the most comprehensive
and accurate of its kind. By generating a score that reflects
the strength of the association, it provides an important tool
for the rapid and flexible analysis of large datasets from various
high-throughput screening experiments. Furthermore, it can
be used for selecting subsets of genes for functional studies,
for building disease-specific arrays, for looking at genes common
to multiple diseases and various other high-throughput
applications. In the future, it will be possible to enhance the
utility of the MedGene database by building links between
genes and other MeSH terms as well as other biological
processes and concepts, such as cell division and responses to
small molecules.
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Discordant Protein and mRNA Expression in 
Lung Adenocarcinomas* 

Guoan Chen‡, Tarek G. Gharib‡, Chiang-Ching Huang§, Jeremy M. G. Taylor§,
David E. Misek¶, Sharon L. R. Kardia‖, Thomas J. Giordano**, Mark D. Iannettoni‡,
Mark B. Orringer‡, Samir M. Hanash¶, and David G. Beer‡ ‡‡



The relationship between gene expression measured at
the mRNA level and the corresponding protein level is not
well characterized in human cancer. In this study, we
compared mRNA and protein expression for a cohort of
genes in the same lung adenocarcinomas. The abundance
of 165 protein spots representing 98 individual
genes was analyzed in 76 lung adenocarcinomas and nine
non-neoplastic lung tissues using two-dimensional
polyacrylamide gel electrophoresis. Specific polypeptides
were identified using matrix-assisted laser desorption/
ionization mass spectrometry. For the same 85 samples,
mRNA levels were determined using oligonucleotide
microarrays, allowing a comparative analysis of mRNA and
protein expression among the 165 protein spots. Twenty-eight
of the 165 protein spots (17%) or 21 of 98 genes
(21.4%) had a statistically significant correlation between
protein and mRNA expression (r > 0.2445; p < 0.05);
however, among all 165 proteins the correlation coefficient
values (r) ranged from -0.467 to 0.442. Correlation
coefficient values were not related to protein abundance.
Further, no significant correlation between mRNA and
protein expression was found (r = -0.025) if the average
levels of mRNA or protein among all samples were applied
across the 165 protein spots (98 genes). The mRNA/
protein correlation coefficient also varied among proteins
with multiple isoforms, indicating potentially separate
isoform-specific mechanisms for the regulation of
protein abundance. Among the 21 genes with a significant
correlation between mRNA and protein, five genes
differed significantly between stage I and stage III lung
adenocarcinomas. Using a quantitative analysis of mRNA
and protein expression within the same lung adenocarcinomas,
we showed that only a subset of the proteins
exhibited a significant correlation with mRNA abundance.
Molecular & Cellular Proteomics 1:304-313, 2002.



Lung cancer is the leading cause of cancer death for both
men and women in the United States. Adenocarcinomas of
the lung comprise ~40% of all new cases of non-small cell
lung cancer and are now the most common histologic type.
Functional genomics, broadly defined as the comprehensive
analysis of genes and their products, has become a recent
focus of the life sciences (1). Application of these approaches to
lung adenocarcinomas has the potential to aid in the identification
of high risk patients with resectable early stage lung cancer
that may benefit from adjuvant therapy, as well as to identify
new therapeutic targets. In human lung cancer, however, little is
currently understood regarding the relationship between gene
expression as determined by measuring mRNA levels and the
corresponding abundance of the protein products.

A number of powerful techniques for analysis of gene expression
have been used including differential display (2),
serial analysis of gene expression (3), DNA microarrays (4),
and proteomics via two-dimensional polyacrylamide gel
electrophoresis and mass spectrometry (5). Bioinformatics tools
have also been developed to help determine quantitative
mRNA/protein expression profiles of all types of cells and
tissues (6) and now can be applied to benign and malignant
tumors. DNA microarrays (cDNA and oligonucleotide) permit
the parallel assessment of thousands of genes and have been
utilized in gene expression monitoring (7), polymorphism
analysis (8), and DNA sequencing (9). Recent studies have
focused on classification or identification of subgroups of lung
tumors using DNA microarrays (10, 11). The use of mRNA
expression patterns by themselves, however, is insufficient for
understanding the expression of protein products, as additional
post-transcriptional mechanisms, including protein
translation, post-translational modification, and degradation,
may influence the level of a protein present in a given cell or
tissue. Proteomic analysis, a complementary technology to
DNA microarrays for monitoring gene expression, involves
protein separation and quantitative assessment of protein
spots using 2D¹-PAGE and protein identification using mass
spectrometry. By combining proteomic and transcriptional
analyses of the same samples, however, it may be possible to
understand the complex mechanisms influencing protein
expression in human cancer.

In this study, we determined mRNA and protein levels for
165 proteins (98 genes) in 76 lung adenocarcinomas and nine



¹ The abbreviations used are: 2D, two-dimensional; MALDI-MS,
matrix-assisted laser desorption/ionization mass spectrometry.
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Protein and mRNA Correlation In Lung Adenocarcinomas 



Table I 

Correlation coefficients of protein and mRNA where only one spot was present on 2D gels
r, correlation coefficient value > 0.2445; p < 0.05. Values in boldface are significant at p < 0.05.



Spot 
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Gene name 




1104 
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0A^7 
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H8.77840 
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Hs. 10958 


DJ-1 
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Hs.75428 
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0264 
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Hs.1 11334 
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0963 


Hs 30071 1 
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Hs.4745 


r wivlw 




0906 


Hs 234489 


LDHR 
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1171 




novii 

VUA 1 1 
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Hfi 181013 
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no. 1 •tvou 
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1193 




nUCO / & 


U. 1932 


01 72 
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UCD AQD 

nor/\yb 


0.1872 


0777 


14 a Q7Q 
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0.1856 


1249 


Mft 00(y7Qt\ 




0.1773 


1685 


nSi r V 1 oo 


TVM 


0.1 732 


1205 




nPHTi 


0.1588 


1230 




T^PT^ 

TPTl 


0.1466 


0603 


ns. lo 1 oof 


LAMni 


0.1463 


1358 




A DDT 


0.1399 


1410 


no.O£ MO 


LJUT 


0,1213 


1826 


1 lO'lTfi 

ns, 1 iicofo 


LI MSI 


0.1213 


0871 






0.1122 


0289 


Met 


UU 1 OA 


0.1106 


1143 


not 1 1 two 




0.0997 


1456 


no. 1 1 Ouoo 


K\KAC4 

iNMci 


0.0932 


1598 






0.0905 


1354 


89761 
ns.os r D 1 




0.0904 


1445 


Hfi 


niri 


A AOitO 


1479 


Hft 1 774S6 
no. ill fvv 




0.0746 


0608 


Hs 182265 




0.0439 


1071 


Hs 10842 


RAM 


0.0Z77 


0991 


Hs 297939 


CTSB 


0.0404 


0842 


Hs77274 


PLAU 




0823 


Hs. 188248 


T1 


f\ DO 


0613 


Hs.1247 




A A4 7<2 
0.0 1 fO 


1338 


Hs.104143 


CLTA 


U.U l£0 


0902 


Hs.5123 




w.u 1 1 / 


1688 


H8.1473 


GRP 




0265 


Hs.274402 


HSPA1R 


_A nA7i 


1414 


Hs,77541 


ARF5 


- n nnoA 


0710 


H8.97206 


HIP1 


_n nii^ 
-ViU 1 1*1 


0532 


Hs.1 70328 


MSN 




0525 


Hs.284255 


ALPP 


-0.0148 


0513 


Hs.76901 


PDIR 


-0.0289 


1659 


HS.2S6697 


HINT 


-0.0312 


1262 


H8.7016 


RAB7 


-0.0362 


0190 


H8.184411 


ALB 


-0.0470 


0948 


Hs.2795 


LDHA 


-0.0549 


0502 


Hs.1 80532 


GPI 


-0.0575 


0152 


Hs.75410 


HSPA6 


-0.0640 


1054 


H$.74276 


CUC1 


-0.0686 


0709 


H8.253495 


SFTPD 


-0.0936 


0867 


Hs.78996 


PCNA 


-0.0982 


0165 


Hs.1 8041 4 


HSPA8 


-0.1014 


1109 


Hs.75103 


YWHAZ 


-0.1018 


0137 


Hs.554 


SSA2 


-0.1032 



Protein name

14-3-3 σ
Annexin IV
DJ-1 protein/MER5
Superoxide dismutase (Cu-Zn)
Galectin 1
Transformation up-regulated nuclear protein
Ferritin light chain
Annexin V
26 S proteasome p28
L-lactate dehydrogenase H chain (LDH-B)
COX 11
Phosphoglycerate mutase
Dihydrolipoamide dehydrogenase precursor
Antioxidant enzyme AOE372
GRP75
Pyruvate dehydrogenase E1-β subunit precursor
Glutathione S-transferase pi (GST-pi)
Thioredoxin
HG phosphoribosyltransferase
Translationally controlled tumor protein (TCTP)
LAMR
Adenine phosphoribosyl transferase
dUTP pyrophosphatase (dUTPase)
Pinch-2 protein
Carbonic anhydrase-related protein; Syntaxin
Chaperonin-like protein
Glutathione S-transferase homolog (GST homolog)
Nm23 (NDPKA)
RliG (U32331)
F1F0-type ATP synthase subunit d
Huntingtin interacting protein 2 (HIP2)
Amyloid B4A
Cytokeratin 19
GTP-binding nuclear protein RAN (TC4)
Cathepsin B
Urokinase plasminogen activator
β 1,4-galactosyl transferase
Apolipoprotein A4 (ApoA4)
Clathrin light chain A
Cytosolic inorganic pyrophosphatase
Preprogastrin-releasing peptide
Heat shock-induced protein
ADP-ribosylation factor 1
Huntingtin interacting protein 1 (HIP1)
Moesin/E
Alkaline phosphatase, placental
Protein disulfide isomerase-related protein 5
Protein kinase C inhibitor
Rab 7 protein
Albumin
Lactate dehydrogenase-A (LDHA)
Hspd9
GRP78
Nuclear chloride channel (RNCC protein)
Pulmonary surfactant protein D
PCNA
Heat shock cognate protein, 71 kDa
14-3-3 ζ/δ
Ro/SS-A antigen
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Table \— continued 



Spot 



Unigene 



Gene name 



Protein name 



0278 
1769 
0089 
2511 
1739 
1138 
2533 



Hs.4112 

Hs.9614 

H8.74335 

H8.163179 

Hs.16488 

Hs.301961 

H8.77060 



TCP1 

NPM1 

HSPCB 

FABP5 

CALR 

GSTM4 

PSMB8 



-0.1237 T-complex protein I, a subunit 

-0.1738 B23/numatrln 

-0.2049 HspSO 

-0.2109 E-FABP/FABP5 

-0.2344 . Calreticulln 32 

-0.2438 Glutathione S-transferase M4 (GST m4) 

-0.2S12 Macropaln subunit A 



non-neoplastic lung tissues. Protein levels were determined
using quantitative 2D-PAGE analysis, and the separated protein
polypeptides were identified using matrix-assisted laser
desorption/ionization mass spectrometry (MALDI-MS). The
corresponding mRNA levels for the identified proteins within
the same samples were determined using oligonucleotide
microarrays. Correlation analyses showed that protein abundance
is likely a reflection of the transcription for a subset of
proteins, but translation and post-translational modifications
also appear to influence the expression levels of many
individual proteins in lung adenocarcinomas.

EXPERIMENTAL PROCEDURES 

Tissues - Fifty-seven stage I and 19 stage III lung adenocarcinomas,
as well as nine non-neoplastic lung tissue samples, were used
for protein and mRNA analyses. Patient consent was obtained, and
the project was approved by the Institutional Review Board. All
tissues were obtained after resection at the University of Michigan
Health System between May 1991 and July 1998. Tissues were all
snap-frozen in liquid nitrogen and then stored at -80 °C. The patients
included 46 females and 30 males ranging in age from 40.9 to 84.6
(average 63.8) years. Most patients (66/76) demonstrated a positive
smoking history. Sixty-one tumor samples were classified as
bronchial-derived, 14 were classified as bronchoalveolar, and one had
both features. Eighteen tumor samples were classified as well
differentiated, 38 were classified as moderate, and 19 were classified as
poorly differentiated adenocarcinomas. Hematoxylin-stained cryostat
sections (5 μm), prepared from the same tumor pieces to be utilized
for protein and mRNA isolation, were evaluated by a pathologist and
compared with hematoxylin- and eosin-stained sections made from
paraffin blocks of the same tumors. Specimens were excluded from
analysis if they showed unclear or mixed histology (e.g.
adenosquamous), tumor cellularity less than 70%, potential metastatic
origin as indicated by previous tumor history, extensive lymphocytic
infiltration, or fibrosis or if the patient had received prior
chemotherapy or radiotherapy.

Oligonucleotide Array Hybridization - The HuGeneFL oligonucleotide
arrays (Affymetrix, Santa Clara, CA) containing 6800 genes were
used in this study. Total RNA was isolated from all samples using
Trizol reagent (Invitrogen). The resulting RNA was then subjected to
further purification using RNeasy spin columns (Qiagen). Preparation
of cRNA, hybridization, and scanning of the HuGeneFL arrays were
performed according to the manufacturer's protocol (Affymetrix,
Santa Clara, CA). Data analysis was performed using GeneChip 4.0
software. The gene expression profile of each tumor was normalized
to the median gene expression profile for the entire sample. Details of
data trimming and normalization are described elsewhere (11).

2D-PAGE and Quantitative Protein Analysis - Tissue for both protein
and mRNA isolation came from contiguous areas of each sample.
Protein separation using 2D-PAGE, silver staining, and digitization
were performed as described previously (12, 13). Our 2D-PAGE system
allows us to run 20 gels at one time (one batch). Spot detection
and quantification were accomplished utilizing Bio Image Visage
System software (Bioimage Corp., Ann Arbor, MI). The integrated
intensity of each spot was calculated as the measured optical density
units × mm². Of the total possible 2000 spots detectable on each gel,
820 spots on the gel of each sample were matched using a Gel-ed
match program with the same spots on a chosen "master" gel. In
each sample, 250 ubiquitously expressed reference spots were used
to adjust for variations between gels, such as that created by subtle
differences in protein loading or gel staining. Slight differences
because of batch were corrected after spot-size quantification.

Mass Spectrometry and 2D Western Blotting - Preparative 2D gels
were run using extracts from A549 lung adenocarcinoma cells
(obtained from ATCC) and using the identical experimental conditions as
the analytical 2D gels, except 30% more protein was loaded. The
resolved protein gels were silver-stained using successive incubations
in 0.02% sodium thiosulfate for 2 min, 0.1% silver nitrate for 40
min, and 0.014% formaldehyde plus 2% sodium carbonate for 10
min. For protein identification, protein polypeptides underwent trypsin
digestion followed by MALDI-MS using a MALDI-TOF Voyager-DE
mass spectrometer (Perseptive Biosystems, Framingham, MA). The
masses were compared with known trypsin digest databases using
the MS-FIT database (University of California, San Francisco;
prospector.ucsf.edu/ucsfhtml3.2/msfit.htm). Some of the polypeptides
included in the analysis had been identified prior to this study on
the basis of sequencing (14). The identified protein spots used in this
paper are shown in Fig. 1A. The method for 2D-PAGE Western blot
verification was as described previously (15). The 2D Western blots of
GRP58 and Op18 are shown in Fig. 1, C and E; the others, such as
GRP78, GRP75, HSP70, HSC70, KRT8, KRT18, KRT19, Vimentin,
ApoJ, 14-3-3, Annexin I, Annexin II, PGP9.5, DJ-1, GST-pi, and
PGAM, are described elsewhere.^

^ Chen et al., submitted for publication.

TABLE II

Correlation coefficients of protein and mRNA where multiple isoforms were present on 2D gels
r, correlation coefficient; values > 0.2445 have p < 0.05. Values in boldface (not preserved here) are significant at p < 0.05.
[illegible], cell garbled in the scan; [missing], cell absent from the scan. A garbled mark in front of a value is read as a minus sign.

Spot          Unigene       Gene name     r             Protein name
1494          [missing]     LAP18         0.4003        OP18 (Stathmin)
0957          Hs.77899      TPM1          [missing]     Tropomyosins 1-5
0353          [illegible]   [missing]     [illegible]   Protease disulfide isomerase (GRP58)
0655          [illegible]   [missing]     [illegible]   Glyceraldehyde-3-phosphate dehydrogenase
1198          Hs.41707      [missing]     0.3868        Hsp27
[missing]     [missing]     TPI1          0.3395        Triose phosphate isomerase (TPI)
[missing]     Hs.65114      KRT18         0.3335        Cytokeratin 18
1492          [illegible]   LAP18         0.3234        OP18 (Stathmin)
1493          [illegible]   LAP18         0.3154        OP18 (Stathmin)
[illegible]   [missing]     ANXA1         0.3102        Annexin variant I
0439          [missing]     KRT8          0.3049        Cytokeratin 8
[illegible]   Hs.297753     VIM           0.293[?]      Vimentin
[missing]     Hs.297753     VIM           0.2809        Vimentin
[illegible]   [illegible]   AKR1B1        0.2790        Aldose reductase
[missing]     Hs.75544      YWHAH         0.2775        14-3-3 η
[missing]     [illegible]   ANXA1         0.2612        Annexin I
[missing]     Hs.65114      KRT18         0.2801        Cytokeratin 18
[illegible]   Hs.41707      HSPB1         0.2558        Hsp27
[illegible]   [missing]     GRP58         0.2516        Phospholipase C (GRP58)
[missing]     [illegible]   AKR1B1        -0.2460       Aldose reductase
[illegible]   [illegible]   AKR1B1        0.0761        Aldose reductase
[illegible]   [illegible]   AKR1B1        -0.0675       Aldose reductase
[missing]     [illegible]   ALDH1         -0.0566       Aldehyde dehydrogenase
[illegible]   [missing]     ALDH1         -0.0371       Aldehyde dehydrogenase
[illegible]   [missing]     ALDH1         -0.0680       Aldehyde dehydrogenase
[missing]     [missing]     ANXA1         0.2062        Annexin variant I
[missing]     [missing]     ANXA1         -0.0739       Annexin I
[missing]     [illegible]   ANXA1         -0.0228       Annexin I
[missing]     [illegible]   ANXA2         0.2223        Lipocortin (annexin II)
0779          [missing]     ANXA2         0.2080        Lipocortin (annexin II)
[missing]     [illegible]   ANXA2         0.0701        Lipocortin
[missing]     [illegible]   APOA1         0.1133        Apolipoprotein A1 (ApoA1)
[missing]     [missing]     APOA1         -0.0373       Apolipoprotein A1 (ApoA1)
[missing]     [illegible]   APOA1         -0.0894       Apolipoprotein A1 (ApoA1)
0493          [illegible]   ATP5B         0.0080        ATP synthase β subunit precursor
0427          [illegible]   ATP5B         0.0122        ATP synthase β subunit precursor
0424          [illegible]   ATP5B         -0.0992       ATP synthase β subunit precursor
0863          [illegible]   [missing]     -0.0483       Apolipoprotein J (ApoJ)
0780          [illegible]   [illegible]   -0.0443       Apolipoprotein J (ApoJ)
1527          [illegible]   [illegible]   -0.0726       eIF-5A
1484          [illegible]   [illegible]   -0.0376       eIF-5A
1728          Hs.5241       FABP1         -0.1916       L-FABP
1712          Hs.5241       FABP1         -0.0473       L-FABP
0947          Hs.169478     [missing]     0.1745        Glyceraldehyde-3-phosphate dehydrogenase
1232          Hs.75207      GLO1          0.2249        Glyoxalase-I
1229          Hs.75207      GLO1          [illegible]   Glyoxalase-I
1695          Hs.158300     HAP1          -0.0137       Huntingtin-associated protein 1 (neuroan 1)
1810          Hs.75990      HP            -0.4672       α-Haptoglobin
1459          Hs.75990      HP            0.0802        α-Haptoglobin
1458          Hs.75990      HP            -0.0305       α-Haptoglobin
0619          Hs.75990      HP            0.0461        β-Haptoglobin
0615          Hs.75990      HP            -0.0034       β-Haptoglobin
1250          Hs.41707      HSPB1         -0.1024       Hsp27
0549          Hs.79037      HSPD1         0.1074        Hsp60
0338          Hs.79037      HSPD1         0.2265        Hsp60
0333          Hs.79037      HSPD1         0.1383        Hsp60
0331          Hs.79037      HSPD1         0.1603        Hsp60
2381          Hs.65114      KRT18         0.2016        Cytokeratin 18
0535          Hs.65114      KRT18         0.1106        Cytokeratin 18
0529          Hs.65114      KRT18         0.1279        Cytokeratin 18
0528          Hs.65114      KRT18         0.0414        Cytokeratin 18
0527          Hs.65114      KRT18         0.0436        Cytokeratin 18
0514          Hs.65114      KRT18         0.0733        Cytokeratin 18
0451          Hs.242463     KRT8          -0.0111       Cytokeratin 8
0446          Hs.242463     KRT8          0.0347        Cytokeratin 8
0444          Hs.242463     KRT8          -0.1311       Cytokeratin 8
0443          Hs.242463     KRT8          [missing]     Cytokeratin 8
1488          Hs.81916      LAP18         0.0495        OP18 (Stathmin)
0321          Hs.75655      P4HB          -0.0546       PDI (prolyl-4-OH-B)
0320          Hs.75655      P4HB          [illegible]   PDI (prolyl-4-OH-B)
1063          Hs.75323      PHB           0.0441        Prohibitin
0837          Hs.75323      PHB           [missing]     Prohibitin
0326          Hs.297681     SERPINA1      [illegible]   α-1-Antitrypsin
0322          Hs.297681     SERPINA1      -0.0077       α-1-Antitrypsin
0241          Hs.297681     [missing]     -0.014[?]     α-1-Antitrypsin
1280          Hs.301254     SFTPA1        [missing]     Pulmonary surfactant-associated protein
1278          Hs.301254     SFTPA1        [illegible]   Pulmonary surfactant-associated protein
0866          Hs.73980      TNNT1         [illegible]   Troponin T
0778          Hs.73980      [missing]     [missing]     Troponin T
1213          Hs.83848      TPI1          [illegible]   Triose phosphate isomerase (TPI)
1210          Hs.83848      TPI1          [illegible]   Triose phosphate isomerase (TPI)
1207          Hs.83848      TPI1          [illegible]   Triose phosphate isomerase (TPI)
1204          Hs.83848      TPI1          [missing]     Triose phosphate isomerase (TPI)
1202          Hs.83848      TPI1          [illegible]   Triose phosphate isomerase (TPI)
1161          Hs.83848      TPI1          [missing]     Triose phosphate isomerase (TPI)
1052          Hs.77899      TPM1          -0.1040       Tropomyosin clean-product
1039          Hs.77899      TPM1          -0.2999       Cytoskeletal tropomyosin
1035          Hs.77899      TPM1          -0.3821       Tropomyosin
0783          Hs.77899      TPM1          0.0757        Tropomyosins 1-5
1574          Hs.194366     TTR           -0.0065       Transthyretin
0809          Hs.194366     TTR           0.0399        Transthyretin multimere
2202          Hs.76118      UCHL1         -0.0220       Ubiquitin carboxyl-terminal hydrolase isozyme L1
1246          Hs.76118      UCHL1         -0.1261       Ubiquitin carboxyl-terminal hydrolase isozyme L1
1242          Hs.76118      UCHL1         0.1473        Ubiquitin carboxyl-terminal hydrolase isozyme L1
0606          Hs.297753     VIM           0.0951        Vimentin
0594          Hs.297753     VIM           -0.2664       Vimentin-derived protein (vid4)
0508          Hs.297753     VIM           0.1008        Vimentin-derived protein (vid2)
0419          Hs.297753     VIM           0.0032        Vimentin-derived protein (vid1)
1279          Hs.75544      YWHAH         0.0059        14-3-3 η

Statistical Analysis - Missing values were replaced with the mean
value of the protein spot. The transform $x \mapsto \log(1 + x)$ was
applied to normalize all protein expression values. The relationship
between protein and mRNA expression levels within the same samples was
examined using the Spearman correlation coefficient analysis (16). To
identify potentially significant correlations between gene and protein
expression, we used an analytical strategy similar to SAM (significance
analysis of microarrays) (17), which uses a permutation technique to
determine the significance of changes in gene expression between
different biological states. To obtain permuted correlation
coefficients between gene and protein expression, genes were exchanged
first in such a way that permuted correlation coefficients were
calculated based on pseudo pairs of genes and proteins. The
distribution of permuted correlation coefficients became stable after
60 permutations. This procedure was then repeated 60 times to
obtain 60 sets of permuted correlation coefficients. For each of the
60 permutations, the correlations of genes and proteins were ranked
such that $\rho_p(i)$ denotes the $i$th largest correlation coefficient
for the $p$th permutation. Hence, the expected correlation coefficient
was the average over the 60 permutations,
$\rho_E(i) = \sum_p \rho_p(i)/60$. A scatter plot of observed
correlations $\rho(i)$ versus the expected correlations is shown in
Fig. 2D. For this study, we chose threshold $\Delta = 0.115$ so that a
correlation would be considered significant if the absolute value of
the difference between $\rho(i)$ and $\rho_E(i)$ was greater than the
threshold. Twenty-nine (including one with observed correlation
coefficient -0.4672) of 165 pairs of gene and protein expression were
called significant under this criterion, and the permuted data
generated an average of 5.1 falsely significant pairs of gene and
protein expression. This provided an estimated false discovery rate
(the percentage of pairs of gene and protein expression identified by
chance) for our data set.
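
The permutation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`log_transform`, `spearman`, `sam_like`), the list-of-lists data layout, and the random seed are hypothetical, and the false-call count is obtained by applying the same threshold rule to each permuted set, which is a simplification of the SAM strategy the text refers to.

```python
import math
import random

def log_transform(values):
    """Replace missing values (None) with the spot mean, then apply x -> log(1 + x)."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [math.log1p(mean if v is None else v) for v in values]

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def sam_like(protein, mrna, delta, n_perm=60, seed=0):
    """protein[k] and mrna[k] are the per-sample values of the k-th gene/protein pair.

    Returns (observed, expected, called, false_calls):
      observed    - observed correlations, sorted from largest to smallest
      expected    - mean of the i-th largest permuted correlation over n_perm runs
      called      - positions i where |observed[i] - expected[i]| > delta
      false_calls - average number of permuted correlations passing the same rule
    """
    rng = random.Random(seed)  # fixed seed (assumption) so the sketch is repeatable
    observed = sorted((spearman(p, m) for p, m in zip(protein, mrna)), reverse=True)
    n_pairs = len(observed)
    expected = [0.0] * n_pairs
    perm_sets = []
    for _ in range(n_perm):
        shuffled = mrna[:]
        rng.shuffle(shuffled)  # exchange genes to form pseudo pairs
        perm = sorted((spearman(p, m) for p, m in zip(protein, shuffled)), reverse=True)
        perm_sets.append(perm)
        for i, r in enumerate(perm):
            expected[i] += r / n_perm
    called = [i for i in range(n_pairs) if abs(observed[i] - expected[i]) > delta]
    false_calls = sum(
        sum(1 for i in range(n_pairs) if abs(perm[i] - expected[i]) > delta)
        for perm in perm_sets
    ) / n_perm
    return observed, expected, called, false_calls
```

With real data, `protein` would hold the log-transformed spot intensities and `mrna` the matching array values for the same samples, and an estimated false discovery rate would be `false_calls / len(called)`. Taking the figures reported in the text (an average of 5.1 false calls against 29 called pairs), that ratio works out to roughly 18%, a number the source itself does not state.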

RESULTS 

Correlation of Individual Proteins and mRNA Expression
within Each Tumor - We have examined quantitatively 165
protein spots on 2D gels representing 98 genes and compared
protein levels with mRNA levels for a cohort of 85 lung
adenocarcinomas and normal lung samples. Of the 165 protein
spots, 69 proteins were represented by only one known
spot on 2D gels for an individual gene, whereas 96 protein
spots showed multiple protein products from 29 different
genes. 2D Western blotting verified the proteins identified by
mass spectrometry when specific antibodies were available.
Spearman correlation coefficients of the proteins and their
associated mRNA for each protein spot were generated using
all 76 lung adenocarcinomas and nine non-neoplastic lung
tissues (see Tables I and II, and see Figs. 1 and 2). The
correlation coefficients (r) ranged from -0.467 to 0.442 (Fig.
2D). A total of 28 protein spots (21 genes) were found to have
a statistically significant correlation between expression of
their protein and mRNA (r > 0.2445; p < 0.05). This accounts
for 17% (28/165) of the 165 protein spots. Among the 69
genes for which only a single protein spot was known (Table
I), nine genes (9/69, 13%) were observed to show a statistically
significant relationship between protein and mRNA
abundance (r > 0.2445; p < 0.05). The proteins whose expression
levels were correlated with their mRNA abundance
included those involved in signal transduction, carbohydrate
metabolism, apoptosis, protein post-translational modification,
structural proteins, and heat shock proteins (Table III).

Fig. 1. A, digital image of a silver-stained 2D-PAGE separation of a stage I lung adenocarcinoma showing protein spots separated by
molecular mass (MW) and isoelectric point (pI). Twenty-eight protein spots whose expression levels are correlated with mRNA abundance are
indicated by the black arrows. B, the outlined areas of A showing protein GRP58. C, 2D Western blot of GRP58 from the A549 lung
adenocarcinoma cell line. D, the outlined areas of A showing the protein isoforms of Op18. E, 2D Western blot of Op18 from A549 cells.

Individual Isoforms of the Same Protein Have Different
Protein/mRNA Correlation Coefficients - Of the 165 protein
spots, 96 represent protein products of 29 genes with at least
two isoforms. Among these 96 protein spots, 19 (19/96 protein
spots, 20%) showed a statistically significant correlation
between their protein and mRNA expression (r > 0.2445; p <
0.05) (Table II) and represented 12 genes (12/29, 41%).
Individual isoforms of the same protein demonstrated different
protein/mRNA correlation coefficients. For example, 2D-PAGE/
Western analysis revealed four isoforms of OP18 differing in
regard to isoelectric point but similar in molecular weight.
Three of the four isoforms (spots 1492, 1493, and 1494) showed
a statistically significant correlation between their protein and
mRNA abundance (r = 0.3234, 0.3154, and 0.4003, respectively).
The fourth isoform (spot 1488) showed no correlation between
protein and mRNA expression (r = 0.0495). Similarly, just
one of five quantified isoforms of cytokeratin 8 (spot 439)
demonstrated a statistically significant correlation between protein
and mRNA abundance (r = 0.3049; p < 0.05) (Table II).

In addition to differences in the relationship between mRNA
levels and protein expression among separate isoforms, some
genes with very comparable mRNA levels showed a 24-fold
difference in their protein expression. Genes with comparable
protein expression levels also showed up to a 28-fold variance
in their mRNA levels.

Lack of Correlation for mRNA and Protein Expression when
Using Average Tumor Values across All 165 Protein Spots (98
Genes) - The relationship between mRNA and protein expression
was also examined by using the average expression
values for all samples. To analyze this relationship using this
approach, the average value for each protein or mRNA was
generated using all 85 lung tissue samples. The normalized
average protein values ranged from -0.0646 to
0.0979 (raw value 0.0036 to 4.1947), and the range for mRNA
was from 0 to 15260.5 for all 165 individual protein spots. The
Spearman correlation coefficient for the whole data set (165
protein spots/98 genes) was -0.025 (Fig. 3A). Even for the 28
protein spots (Fig. 2D) that were found to have a statistically
significant correlation between their mRNA and protein, use of
the average value resulted in a correlation coefficient value of
-0.035, which was not significant (Fig. 3B).

Fig. 2. A-C, plots showing the correlation between mRNA and protein for the three selected genes Op18, Annexin IV, and GAPD for all 76
lung adenocarcinomas and nine non-neoplastic lung samples (p < 0.05). D, distribution of all 165 Spearman correlation coefficients (r) and
verification analysis using SAM. A more detailed description of the method is provided under "Experimental Procedures." Approximately 17%
of the 165 proteins demonstrate a significant correlation between mRNA and protein levels as demonstrated by the values shown beyond the
outer range of threshold Δ = 0.115. Normalized protein values were used, thus negative values for some proteins are observed.

Lack of a Relationship between Protein/mRNA Correlation
Coefficients and Average Protein Abundance - To determine
whether an absolute protein level might influence the correlation
with mRNA, the mean value of each protein (relative
abundance) and the Spearman protein/mRNA correlation
coefficients among all 85 samples were examined. No relationship
between the protein abundance and the correlation
coefficients was observed (r = 0.039; p > 0.05). A detailed
analysis of separate subsets of proteins with differing levels of
abundance (less than -0.0014, larger than -0.0014, or larger
than 0.0077) also showed a lack of correlation between mRNA
and protein expression among the 83 (50%), 82 (50%), and 41
(25%) of 165 total protein spots, respectively (r = 0.016, 0.08,
and 0.172, respectively).

Stage-related Changes in the Protein/mRNA Correlation
Coefficients - To determine whether the 21 genes (28 protein
spots) showing a significant correlation between the protein
and mRNA expression among all samples demonstrate
changes in this relationship during tumor progression, the
correlations were examined separately for stage I (n = 57) and
stage III (n = 19) lung adenocarcinomas (Table III). The number
of non-neoplastic lung samples (n = 9) was insufficient for
a separate correlation analysis of this group. Many of the
protein spots represent one of several known protein isoforms
for a given gene. The majority of genes (16/21) did not differ in
the protein/mRNA correlation between stage I and stage III
tumors, indicating a similar regulatory relationship between the
mRNA and protein spot. GRP-58, PSMC, SOD1, TPI1, and
VIM, however, were found to demonstrate significant differences
in the correlation coefficients between stage I and
stage III lung adenocarcinomas. For GRP-58, PSMC, and VIM
the change in the correlation coefficient was because of a
relative increase in protein expression in stage III tumors. For
SOD and TPI the change resulted from a relative decrease in
expression of this specific protein in stage III tumors.
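
The text does not say which test was used to decide that a correlation coefficient differed between stage I and stage III tumors. One common way to compare two correlation coefficients from independent groups is Fisher's z-transformation, and the sketch below assumes that approach purely for illustration. The function name and the two r values are hypothetical and are not taken from Table III; only the group sizes (57 and 19) come from the text. For Spearman coefficients the normal approximation used here is only approximate.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Compare two independent correlation coefficients with Fisher's z.

    Returns (z, p), where p is the two-sided p-value from the normal
    approximation. Assumes n1 > 3, n2 > 3 and |r| < 1.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transform of each r
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal tail area
    return z, p

# Hypothetical coefficients for one protein spot in stage I (n = 57)
# and stage III (n = 19) tumors.
z, p = fisher_z_difference(0.60, 57, -0.20, 19)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Because the stage III group is small, the standard error is dominated by the `1 / (n2 - 3)` term, so only fairly large differences between the two coefficients would be flagged by a test of this kind.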

DISCUSSION 

Relatively little is known about the regulatory mechanisms
controlling the complex patterns of protein abundance and
post-translational modification in tumors. Most reports concerning
the regulation of protein translation have focused on
one or several protein products (18). Celis et al. (19) found a
good correlation between transcript and protein levels among
40 well resolved, abundant proteins using a proteomic and
microarray study of bladder cancer. By comparing the mRNA
and protein expression levels within the same tumor samples,
we found that 17% (28/165) of the protein spots (21/98 genes)
show a statistically significant correlation between mRNA and
protein. These proteins appear to represent a diverse group of
gene products and include those involved in signal transduction,
carbohydrate metabolism, protein modification, cell structure,
heat shock, and apoptosis. These results suggest that
expression of this subset of 165 proteins is likely to be regulated
at the transcriptional level in these tissues. The majority of the
protein isoforms, however, did not correlate with mRNA levels,
and thus their expression is regulated by other mechanisms. We
also observed a subset of proteins that demonstrated a negative
correlation with the mRNA expression values; for example,
α-haptoglobin demonstrated a strong negative correlation with
its mRNA expression values. This may reflect negative feedback
on the mRNA or the protein or the presence of other regulatory
influences that are not understood currently.

Table III

Stage-dependent analysis of protein-mRNA correlation coefficients
r, correlation coefficient. Values in boldface indicate a significant difference between stage I and stage III.

Post-translational modification or processing will result in
individual protein products of the same gene migrating to
different locations on 2D-PAGE gels (20). Because the identity
of all possible isoforms for each protein examined has not
been characterized completely, this may influence the correlation
analyses performed in this study. This is partly because
of limitations of the 2D-PAGE and mass spectrometry technologies
(21, 22). Potential inconsistencies between mRNA
and protein correlations that have been reported may also be
because of differences, even in the same gene, in the mechanisms
of protein translation among different cells or as
measured in different laboratories (23).

In this study, we examined 165 protein spots identified in
lung adenocarcinomas. Ninety-six protein spots, representing
the products of 29 genes, contained at least two protein
isoforms. Nineteen of 96 protein spots, representing 12
genes, were shown to have a statistically significant correlation
between their protein and mRNA expression, suggesting
that the levels of these proteins reflect the transcription of the
corresponding genes. Differences in protein/mRNA correlations
were found among the individual isoforms of a given protein. For
example, of the four OP18 isoforms, three showed a statistically
significant correlation between the protein and mRNA expression
levels. The lack of relationship for the one isoform, however,
indicates that individual protein isoforms of the same gene
product can be regulated differentially. This is not unexpected
and likely reflects other post-translational mechanisms that can
influence isoform abundance in tissues and cancer.

In addition to the analyses of the correlation of mRNA/
protein within the same tumor samples, we also tested the
global relationship between mRNA and the corresponding
protein abundance across all 165 protein spots in the lung
samples. A protein and mRNA average value for each gene
was generated using all 85 lung tissue samples. We observed
a very wide range of normalized average protein and
mRNA values. The correlation coefficient generated using this
average value data set was -0.025, and even for the 28
protein spots that showed a statistically significant correlation
between individual mRNA and proteins, the correlation value
was only -0.035. This suggests that it is not possible to
predict overall protein expression levels based on average
mRNA abundance in lung cancer samples. This conclusion is
also supported by previous results from Anderson and Seilhamer
(24), who examined 19 genes in human liver cells, and
by Gygi et al. (25), who examined 106 genes in yeast. Both
studies found a lack of correlation between mRNA and protein
expression when average or overall levels were used.

Fig. 3. The overall correlation of mRNA and protein levels across all
165 protein spots (A) and across 28 protein spots that contained
individual r values larger than 0.244 (B) are shown. Each protein or
mRNA mean value was calculated based on all 76 lung adenocarcinomas
and nine non-neoplastic lung samples using quantitative 2D-PAGE and
Affymetrix oligonucleotide microarrays. The Spearman correlation
coefficients for the two data sets (A and B) were -0.025 and -0.035,
respectively, indicating a lack of correlation if mean values for mRNA
and protein for all samples are used.

A good correlation was reported when the 11 most abundant
proteins were examined in yeast (25), suggesting that the
level of protein abundance may be a factor that influences
the correlation between mRNA and protein. In the present
study, a fairly wide range of mean protein values among 165
protein spots in lung adenocarcinomas was observed, and
the correlation coefficients also varied from -0.467 to 0.442.
A comparison between the mean value of each protein and
the correlation coefficient generated using all 85 tissue samples
did not reveal a strong relationship between the overall
protein abundance and the correlation coefficients (r = 0.039;
p > 0.05). Detailed analysis of different subsets of protein
abundance also failed to show a correlation between mRNA and
protein expression. Thus, in contrast to yeast, a relationship
between mRNA/protein correlation coefficient and protein
abundance in human lung adenocarcinomas was not observed.

The results of this study indicate that the level of protein
abundance in lung adenocarcinomas is associated with the
corresponding levels of mRNA in 17% (28 proteins) of the
total 165 protein spots examined. This was substantially
higher than the amount predicted to result by chance alone
(which was 5.1) and suggests that a transcriptional mechanism
likely underlies the abundance of these proteins in lung
adenocarcinomas. We also demonstrate that the expression
of individual isoforms of the same protein may or may not
correlate with the mRNA, indicating that separate and likely
post-translational mechanisms account for the regulation of
isoform abundance. These mechanisms may also account for
the differences in the correlation coefficients observed between
stage I and stage III tumors, indicating that specific protein
isoforms show regulatory changes during tumor progression.
Further studies in lung adenocarcinomas will examine the
relationship between the expression of individual protein isoforms
and specific clinical-pathological features of these tumors, such
as the presence of angiolymphatic invasion, and nodal or pleural
surface involvement. The potential to identify specific protein
isoforms associated with biological behavior in lung adenocarcinomas
would be of considerable interest and will add to our
understanding of the regulation of gene products by transcriptional,
translational, and post-translational mechanisms.

Abstract: Epithelial tumors develop through a multistep process driven by
genomic instability frequently associated with etiologic agents such as
prolonged tobacco smoke exposure or human papilloma virus (HPV) infection.
The purpose of the studies reported here was to examine the nature of genomic
instability in epithelial tissues at cancer risk in order to identify tissue genetic
biomarkers that might be used to assess an individual's cancer risk and
response to chemopreventive intervention. As part of several chemoprevention
trials, biopsies were obtained from risk tissues (i.e., bronchial biopsies from
chronic smokers, oral or laryngeal biopsies from individuals with
premalignancy) and examined for chromosome instability using in situ hybridization.
Nearly all biopsy specimens show evidence for chromosome instability
throughout the exposed tissue. Increased chromosome instability was observed
with histologic progression in the normal to tumor transition of head and neck
squamous cell carcinomas. Chromosome instability was also seen in premalignant
head and neck lesions, and high levels were associated with subsequent
tumor development. In bronchial biopsies of current smokers, the level of
ongoing chromosome instability correlated with smoking intensity (e.g.,
packs/day), whereas the chromosome index (average number of chromosome
copies per cell) correlated with cumulative tobacco exposure (pack-years).
Spatial chromosome analyses of the epithelium demonstrated multifocal clonal
outgrowths. In former smokers, random chromosome instability was reduced;
however, clonal populations appeared to persist for many years, perhaps
accounting for continued lung cancer risk following smoking cessation.

Keywords: chromosome instability; epithelial cells; aerodigestive tract;
chemoprevention; cancer risk



THE NEED FOR BIOMARKERS OF CANCER RISK AND
RESPONSE TO INTERVENTION

Epithelial cancers remain a major health challenge in the world. Despite
improvements in staging and the application and integration of surgery, radiotherapy,
and chemotherapy, the 5-year survival rate for individuals with lung cancer is only
about 15%. Even if strategies for early detection are successful and lung cancers
are detected at a stage where local tumor resection and treatment is curative,
these patients will still be at significant risk for developing second primary tumors
associated with the problem of field cancerization. Similarly, for individuals with a
first head and neck primary tumor, even if the first malignancy is successfully
treated, the risk of developing a second primary in the tobacco smoke-exposed field
is approximately 40%. Similar cancer risk estimates exist for individuals who exhibit
severe dysplasia in premalignant epithelial lesions. For these reasons, it is important
to focus on chemopreventive strategies to prevent the development of epithelial
malignancies.

Several problems confront chemoprevention trials designed to identify
efficacious agents. First, chemoprevention trials with cancer incidence as a primary
endpoint require tens of thousands of subjects and tens of years of intervention and
follow-up for statistical evaluation. For example, a recently reported trial involved
30,000 subjects and required 10 years in order to examine the impact of prevention
strategies on lung cancer development, only to find a possible increased lung cancer
incidence in current smokers who received β-carotene.

The problem of large, long-term trials results from the difficulty in identifying
individuals at highest cancer risk who might best benefit from chemopreventive
intervention. For example, 20 pack-year smokers, while known to be at relatively
increased risk for developing lung cancer, have approximately a 10% lifetime risk
for developing lung cancer. This seriously limits the number of potentially useful
strategies that can be clinically explored. A second problem facing chemoprevention
trials is that little is known about what agents are likely to have efficacy, and even
less is known regarding proper doses, schedules, and durations of treatment. Part of
the reason for this problem is that too little is known about the physiologic processes
that drive epithelial cancer development.

In order to reduce the number of subjects and the time required to carry out
chemoprevention trials and thus allow the exploration of multiple prevention
strategies, two types of advances are necessary. First, it is important to identify
individuals at significantly increased cancer risk who might best benefit from
different types of intervention. Second, in order to allow the rapid identification of
agents, doses, and schedules of potentially efficacious agents, it is necessary to
identify and validate surrogate endpoints of response that indicate whether the agents
are having a positive impact on the target tissue during the chemopreventive
intervention.

One approach to identifying individuals at increased aerodigestive tract cancer
risk is to explore epidemiologic features of potential subjects. Molecular
epidemiologic studies are beginning to identify intrinsic host factors that place some
individuals at increased cancer risk, especially those with a chronic smoking history.
Most intrinsic factors identified thus far reflect levels of carcinogen metabolism,
repair capabilities of the host following DNA damage, and other measures of intrinsic
cellular sensitivity to mutagens. While these factors can provide statistically
significant risk ratios in case-control studies that are controlled for tobacco exposure,
the detected risk ratios usually fall in the range of 1.5 to 10. Unfortunately, this is
not sufficient for the individualization of treatment and is not sufficiently high to
significantly reduce the numbers of subjects required for chemoprevention trials with
cancer incidence as the primary endpoint.

Another approach to identifying individuals at increased cancer risk is to directly
examine the target tissue of individuals with known carcinogen exposure (e.g.,
chronic tobacco smoke exposure), who have evidence of target organ dysfunction
(e.g., chronic obstructive pulmonary disease, changes in voice quality), or who
have clinical evidence of premalignancy (e.g., bronchial metaplasia/dysplasia, oral
leukoplakia/erythroplakia, cervical intraepithelial neoplasia). The conventional
standard for assessing cancer risk in these situations is the degree of histological
change. However, while individuals who show moderate to severe dysplasia are
known to be at increased cancer risk when compared to individuals with lesser
histologic changes, it is often difficult to distinguish reactive changes to carcinogenic
insult from initiated and progressing lesions. Similarly, upon cessation of
carcinogenic insult, histologic changes may reverse yet cancer risk may continue for
many years. For example, while smoking cessation is associated with decreased
bronchial metaplasia, increased lung cancer risk continues for many years beyond
smoking cessation. In fact, nearly half the newly diagnosed lung cancer cases in the
USA occur in former smokers.

The development of assays to identify individuals at high epithelial cancer risk
and to directly assess response to intervention in the target tissue is therefore an
important research goal. Such assays should be objective and easily quantifiable and,
if possible, minimally invasive. Moreover, they should reflect both the disease
process and the targeted pathway and thereby be useful in assessing risk and monitoring
response to intervention as well as directly testing the hypothesized mechanism of
action of the chemopreventive strategy.

In the chemoprevention setting it is important to recognize that one does not
know the location of the future cancer. Thus, assays must necessarily be carried out
on random biopsies of the field at risk. Even if there are clinically evident
premalignant lesions, this does not mean that this is the likely site for a future
malignancy. For example, nearly half of the cancers that develop in individuals with
oral leukoplakia arise away from the original index lesion. Similarly, since many newly
diagnosed lung cancers arise in the peripheral parts of the lung (e.g., adenocarcinomas),
especially in former smokers, and since endobronchoscopy predominantly accesses
central components of the lung, it is important to identify biomarkers that can reflect
global processes ongoing in the target epithelial field associated with increased
cancer risk. Their discovery requires a better understanding of the tumorigenesis
process in epithelial fields at cancer risk.



THE RATIONALE FOR STUDYING 
GENOMIC INSTABILITY AS A MARKER OF RISK

Tumors of the aerodigestive tract have been proposed to reflect a "field
cancerization" process whereby the whole tissue is exposed to carcinogenic insult
(e.g., tobacco smoke) and is at increased risk for multistep tumor development.12,13
Several types of clinical and laboratory data support this notion, including the
frequent occurrence of synchronous primary and subsequent second primary tumors in
the aerodigestive tract (frequently exhibiting dissimilar histologies as well as distinct
genetic signatures) and the presence of premalignant lesions that precede and/or
accompany the tumor in the exposed tissue field. The notion of a multistep
tumorigenesis process is further supported by serial clinical and histologic evaluations
of target tissue or exfoliated cells where increasing degrees of histological
abnormalities are observed over time.

A working model for aerodigestive tract tumorigenesis is illustrated in Figure 1.
Tumorigenesis in the face of carcinogenic exposure likely involves a chronic process
of tissue injury and wound healing. DNA damage induced by the carcinogen is likely
fixed into permanent genetic changes (e.g., chromosome damage, chromosome
nondisjunction, gene mutation, gene deletion, etc.) during the process of proliferation.
This damage would be expected to be distributed throughout the exposed tissue field,
leading to a background of generalized genomic damage (depicted in Figure 1 as a
background mat of increasing density). Chronic injury and repair likely leads to the
accumulation of cells with increasing amounts of genetic changes as well as the
outgrowth of abnormal clones (triangles in Figure 1) carrying an accumulation of
genetic changes important for selective survival, dysregulated growth, and
preferential epithelial take-over by initiated clones (see Figure 2).

Cellular and molecular evidence for the field carcinogenesis and multistep
tumorigenesis model comes from many laboratories. With the advent of a wide array
of molecular technologies, a large number of specific molecular genetic and
epigenetic changes involving specific oncogenes, tumor suppressor genes, cell
regulatory genes, and repair genes have now been described for aerodigestive tract
cancers. The identification of these specific molecular changes has now provided
probes to explore specific events occurring in premalignant lesions adjacent to
aerodigestive tract tumors. Frequently, these premalignant lesions showed a subset of
the same molecular changes found in the associated tumor, suggesting that these lesions
might represent precursor lesions for the associated tumors (i.e., a manifestation of
a multistep tumorigenesis process). For example, studies of the premalignant lesions
adjacent to head and neck tumors have provided evidence for a gradual accumulation
of genetic alterations accompanied by evidence for dysregulation of cellular control
mechanisms (e.g., alterations in expression of PCNA, EGFR, TGF-β, p53, and
cyclin D1).25-28

FIGURE 1. Field cancerization and multistep tumorigenesis.

FIGURE 2. Multiple focal clonal evolution during multistep tumorigenesis.

These types of studies have now also been applied to the target epithelium of
individuals at increased risk for aerodigestive tract cancer (i.e., individuals with a
chronic smoking/alcohol history and/or prior aerodigestive tract cancer). Several groups
(using polymerase chain reaction, PCR, analysis of microdissected epithelium) have
now demonstrated the presence of clonal outgrowths in the target premalignant
epithelium of individuals at increased risk for cancer. For example, examination of
bronchial biopsies derived from individuals with a 20 pack-year smoking history
demonstrated that 76% of the cases showed evidence for loss of heterozygosity (LOH)
at 3p14, 9p21, or 17p13 in at least one of six lung biopsy sites. On a per site basis,
some form of LOH was observed in 25% of the sites examined.

If aerodigestive tract cancer development reflects a field cancerization process
involving multistep events, then risk and response information should be able to be
derived from random biopsies or exfoliated cells from the field at risk or from
assessments of tissue undergoing similar processes. Hypothetically, lesions exhibiting
the greatest degree of genomic instability, clonal outgrowth, and abnormal epithelial
regulation would be at the highest relative aerodigestive tract cancer risk. Similarly,
an active chemopreventive intervention might be expected to decrease these
manifestations of risk. Reduced risk manifestations include decreased levels of ongoing
genetic instability, decreased frequency of clonal outgrowths, and increased
epithelial growth regulation.



THE MEASUREMENT OF CHROMOSOME INSTABILITY USING
CHROMOSOME IN SITU HYBRIDIZATION

Molecular genetic techniques, while extremely useful for detecting clonal changes
in target tissues, are somewhat limited in their ability to detect random genetic
instability. Conventional cytogenetic assays are useful for detecting chromosome
instability and clonal chromosome changes. However, they require numbers of
dividing cells for karyotypic analysis that are difficult to attain in the setting of
biopsies acquired during the course of a chemoprevention trial. A technique was
therefore needed that would allow chromosome instability measurements in situations
where few cells are available (e.g., small biopsies, brushings, or sputum samples) and
where the target material might be fixed. It was also desirable to have a technique
that would be adaptable to tissue sections, whereby spatial information could be
retained and genotype/phenotype associations could be determined on the same or
adjacent tissue sections. The technique of in situ hybridization (ISH) involves the
use of DNA probes that recognize either chromosome-specific repetitive target
sequences, chromosome single gene copy sequences, or sequences along the whole
chromosome length or chromosome segments. We have adapted the ISH technique
for formalin-fixed, paraffin-embedded tissue sections and have applied it to a variety
of tissues, including the aerodigestive tract.

Using probes that label the centromere regions of specific chromosomes, this
assay permits determination of the average chromosome number per cell for each
specimen. This assay is also useful for detecting generalized chromosome instability
during the tumorigenesis process. Normal diploid populations should have two copies
of each autosomal chromosome and should rarely show three or more chromosome
copies per cell (chromosome polysomy), especially in tissue sections where
nuclear truncation results in an under-representation of chromosome copy number.
Thus, the detection of cells with three or more chromosome copies would indicate
the presence of chromosome instability.
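
As a rough illustration of the two quantities this assay yields, the sketch below computes a chromosome index (mean signal count per nucleus) and a polysomy fraction (share of nuclei with three or more signals) from per-cell counts. The function names and the example counts are hypothetical and are not taken from the studies described here.

```python
from typing import Sequence


def chromosome_index(copies: Sequence[int]) -> float:
    """Average number of chromosome signals per scored cell."""
    if not copies:
        raise ValueError("no cells scored")
    return sum(copies) / len(copies)


def polysomy_fraction(copies: Sequence[int], threshold: int = 3) -> float:
    """Fraction of scored cells showing at least `threshold` signals."""
    if not copies:
        raise ValueError("no cells scored")
    return sum(1 for c in copies if c >= threshold) / len(copies)


# Hypothetical counts for ten nuclei in a tissue section. Many nuclei show
# fewer than two signals because sectioning truncates them.
counts = [2, 1, 2, 1, 3, 1, 2, 1, 2, 1]

print(f"chromosome index:  {chromosome_index(counts):.2f}")
print(f"polysomy fraction: {polysomy_fraction(counts):.1%}")
```

Because nuclear truncation pulls the mean for a diploid population below two, polysomy is scored here against a fixed three-copy threshold rather than against the mean.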

To examine this technique's potential for characterizing the multistep
tumorigenesis process in the aerodigestive tract, we measured the fraction of cells
exhibiting three or more chromosome copies in apparently contiguous epithelial
transitions from normal to hyperplastic to dysplastic to carcinomas, all on a single
tissue slice of head and neck squamous cell carcinomas. In these specimens, greater
than 35% of the cases of adjacent "normal" epithelium, greater than 65% of the cases of
hyperplastic epithelium, and greater than 95% of the dysplastic and tumor regions showed
evidence of chromosome polysomy. Of interest, similar transitions of chromosome
instability were observed with at least four different chromosome probes. Similar
trends have also been observed in amenable tissue from other epithelial malignancies,
including cervix, bladder, and breast. These results thus suggested that the
notions of field cancerization and multistep tumorigenesis might apply to several
epithelial tissues and that measures of chromosome instability might be useful for
monitoring this process.

In the situations described above, the premalignant lesions examined might be
considered to represent epithelium at 100% risk of being in a cancer field, since they
were located in the adjacent epithelium to the cancer. This then raises the question
of the nature of genetic instability in the epithelium of individuals at increased risk
for developing cancer. To explore this issue, we obtained biopsies during the course
of leukoplakia chemoprevention trials exploring the use of 13-cis-retinoic acid in
reversing leukoplakia and probed them for genetic instability using in situ
hybridization. In one retrospective study and in one prospective study of subjects with
oral leukoplakia, the results indicate that those subjects whose pretreatment biopsies
harbor relatively high levels of genomic instability (i.e., more than 3% of the cells
examined showing at least 3 chromosome 9 copies per cell) have a significantly
higher likelihood of suffering early onset of head and neck cancer. Interestingly,
half of the tumors that did develop occurred away from the biopsy site used to
measure genetic instability. This result suggests that genomic instability measurements
in carcinogen-exposed tissue can provide useful cancer risk estimates.



THE RELATIONSHIP BETWEEN TOBACCO EXPOSURE AND
CHROMOSOME INSTABILITY 

In recent years, the aerodigestive tract chemoprevention group at M.D. Anderson
Cancer Center has initiated three sequential biomarker-associated chemoprevention
trials involving chronic smokers with a greater than 20 pack-year smoking history.
In each of these studies, endobronchial biopsies were obtained from six defined sites
within the lung, including the carina and at bifurcation points at the upper, middle,
and lower right lung and at the upper and lower left lung. Biopsies were obtained
prior to and following chemopreventive intervention and were subjected to in situ
hybridization analysis in addition to analyses for other biomarkers. The first
important finding was that some degree of chromosome polysomy was evident in all lung
sites examined, and this was observed independently of the particular chromosome
probe utilized. This finding supports the notion that random chromosome changes
may be occurring throughout the exposed lung field.

In a second study, bronchial biopsies were obtained from individuals with a 20
pack-year smoking history. In this study, most of the subjects involved were current
smokers. Interestingly, all cases who showed metaplasia at one of six biopsy sites
also showed chromosome polysomy in at least one biopsy site; overall, 88% of the
sites showed some evidence of chromosome 9 polysomy. Evidence for genetic
instability was also detected in patients who did not show evidence of bronchial
metaplasia in any of six biopsy sites despite a strong smoking history. In fact, more
than 90% of the cases and more than 60% of the sites showed significant chromosome
polysomy (i.e., at least three copies in at least 2% of the cells examined).
These results suggest that the lungs of long-term smokers show significant evidence
of genetic instability, and this instability can be detected throughout the accessible
bronchial tree, even when bronchial metaplasia is not evident.

These studies in current smokers have allowed us to examine the relationship
between the levels of genetic instability detected and subject characteristics such as
smoking status (current or former), smoking history, and lung tissue pathologic
changes. Evaluable biopsy material has now been obtained from more than 108 current
smokers, including more than 480 evaluable biopsy sites. The mean metaplasia
index in these current smokers was 30.4%. For the total population studied, the
median chromosome index for the bronchial biopsies was 1.41 (range, 1.04-1.61)
and the median chromosome polysomy index was 2.0% (range, 0-«.7%). This can be
compared to a mean chromosome index between 1.2 and 1.4 for lymphocytes and very
rare chromosome polysomy. Interestingly, the intrasubject variability in chromosome
instability was relatively low in most subjects and was less than the intersubject
variability. These results suggested that chronic smokers harbor detectable
chromosome instability throughout the accessible bronchial tree (supporting the
field carcinogenesis notion) and that information from one biopsy site might yield
representative information for the rest of the lung field.

Since most of the current smokers exhibited bronchial metaplasia in at least one
of the biopsied sites, this allowed us to examine the relationship between chromosome
instability and histologic changes, both on a site-by-site basis and on a per case
basis. On a site-by-site basis, the chromosome indices of lesions showing squamous
metaplasia were similar to those not showing metaplasia (i.e., median 1.43 vs. 1.43),
and the degree of chromosome polysomy in metaplastic lesions was only slightly
higher than in non-metaplastic sites (medians: 2.2% vs. 1.8%, respectively). Thus,
the presence or absence of squamous metaplasia at a biopsy site does not necessarily
correlate with the degree of underlying genomic instability. On the other hand, those
subjects with metaplasia indices of at least 15% also showed higher levels of
chromosome polysomy than did subjects with a metaplasia index below 15% (medians:
2.4% vs. 1.8%, p = 0.005). Thus, these chromosome instability assessments in current
smokers appeared to reflect a more global process in the lung field.

Tobacco exposure has been shown to significantly increase the risk of developing
lung cancer, and the degree of risk is related to the extent of tobacco exposure. We
were interested in determining the relationship between individuals' smoking history
parameters and the levels of chromosome change found in their lungs following
years of tobacco exposure. While there was significant intersubject variation for
similar tobacco exposure histories, overall there was a significant correlation between
the degree of chromosome polysomy and the intensity of ongoing tobacco exposure
(packs/day, p = 0.02 on a per site basis) and with the extent of tobacco exposure
(pack-years, p = 0.003). Thus the amount of chromosome polysomy reflects the
intensity and extent of tobacco exposure. At the same time, individuals with similar
smoking histories showed widely divergent amounts of chromosome polysomy,
possibly reflecting differences in intrinsic sensitivity between subjects. There was also
a strong correlation between the chromosome index and the duration of the smoking
history (smoking years) and total accumulated exposure (pack-years, p = 0.0001).
These results suggest that tobacco exposure is associated with the initiation and
accumulation of chromosome instability in the exposed lung; however, individuals
are differentially sensitive to carcinogenic insult. The working hypothesis is that
those individuals who accumulate the most genetic changes will
be at the highest lung cancer risk.

Many of the bronchial biopsies from chronic smokers examined by in situ
hybridization showed a rise in the chromosome index above that expected for a diploid
cell population, especially in subjects with an extensive smoking history. The rise in
chromosome index was also accompanied by an increase in the fraction of cells
exhibiting at least 3 chromosome copies per cell. To determine if a rise in the tissue
chromosome index was due to clonal expansion of populations with chromosome
trisomy, the chromosome copy number and relative coordinates of each cell scored in
the bronchial epithelium were recorded and a spatial genetic map was created. We
then developed algorithms for calculating localized chromosome indices within the
tissue. Since trisomic clones would have, on average, three chromosomes instead of
two, those cells involved in neighborhoods with chromosome indices three-halves
that of diploid populations could be marked as being part of a trisomic clone.
Similarly, groups of cells with chromosome indices half that of diploid populations could
be marked as being part of a monosomic clone. This allowed the generation of a
second-order, two-dimensional genetic map representation of the bronchial epithelium
showing the relative locations of cells involved in monosomic and trisomic clonal
outgrowths. When adjacent tissue sections from the same bronchial biopsy were
probed separately for different chromosomes, the detected clones appeared to occupy
separate subregions of the epithelium. This result suggests that not only are the
lungs of chronic smokers undergoing a process of genetic instability, they are
experiencing the outgrowth of multiple clones throughout the exposed lung field, as
postulated by the models shown in Figures 1 and 2. One advantage of this clonal
approach is that the contribution of both monosomic and multisomic clones can be
detected.
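
The neighborhood-based clone marking described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' algorithm: the fixed-radius neighborhood, the ratio cutoffs of 1.25 and 0.75 (midpoints between the diploid ratio of 1.0 and the expected ratios of 1.5 and 0.5), and all names are hypothetical.

```python
from math import hypot
from typing import List, NamedTuple


class Cell(NamedTuple):
    x: float     # position of the nucleus in the section (arbitrary units)
    y: float
    copies: int  # centromere signals counted in this nucleus


def local_indices(cells: List[Cell], radius: float) -> List[float]:
    """Mean copy number over all cells within `radius` of each cell (itself included)."""
    indices = []
    for c in cells:
        near = [n.copies for n in cells if hypot(n.x - c.x, n.y - c.y) <= radius]
        indices.append(sum(near) / len(near))
    return indices


def label_clones(cells: List[Cell], radius: float, diploid_index: float) -> List[str]:
    """Label each cell by its local index relative to a diploid reference index."""
    labels = []
    for idx in local_indices(cells, radius):
        ratio = idx / diploid_index
        if ratio >= 1.25:      # assumed cutoff: closer to 3/2 than to 1
            labels.append("trisomic")
        elif ratio <= 0.75:    # assumed cutoff: closer to 1/2 than to 1
            labels.append("monosomic")
        else:
            labels.append("diploid")
    return labels


# Hypothetical map: three well-separated patches along one axis.
cells = (
    [Cell(x, 0.0, 2) for x in (0.0, 1.0, 2.0)]
    + [Cell(x, 0.0, 3) for x in (100.0, 101.0, 102.0)]
    + [Cell(x, 0.0, 1) for x in (200.0, 201.0, 202.0)]
)
labels = label_clones(cells, radius=5.0, diploid_index=2.0)
print(labels)
```

The diploid reference is passed in as a parameter because, in tissue sections, the measured index of a diploid population falls below two; in practice it would be taken from a control population scored the same way.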

Since smoking cessation has been suggested to reduce the lung cancer risk, it was
of interest to determine whether the levels of chromosome instability would decrease
following smoking cessation. This question was possible to examine because our
third sequential chemoprevention trial involved subjects who had discontinued
smoking. So far, more than 220 subjects (more than 650 biopsies) who have quit
smoking (mean 9.9 quit-years) have been evaluated for chromosome instability in
their lungs. Despite the fact that the mean metaplasia index in this group is 5.8%
(considerably less than that in current smokers), chromosome instability is still
observed in the majority of subjects. While the mean chromosome polysomy level
is reduced to 1.0%, some individuals continue to show polysomy levels above 5%.
Interestingly, while the overall chromosome polysomy levels were reduced in these
individuals who stopped smoking, the mean chromosome index remained at about
1.4 with some individuals exhibiting chromosome indices as high as 1.8. Initial
chromosome mapping studies suggest that while random chromosome instability seems
to decrease following smoking cessation, the clonal outgrowths may remain for
many years in the lung. The working hypothesis is that those individuals who show
the greatest degree of remaining chromosome instability are at the highest lung
cancer risk despite smoking cessation. Long-term follow-up on these subjects will be
necessary to test this hypothesis.



SUMMARY AND CONCLUSIONS

Aerodigestive tract tumorigenesis appears to be a multistep process taking place
throughout the tissue fields of exposure. When viewed in the context of chromosome
changes, carcinogen exposure appears to be associated with the random acquisition
of chromosome polysomy throughout the exposed field, the degree of which is related
to the degree and extent of carcinogen exposure as well as to the intrinsic
susceptibility of the exposed individual. Continued exposure leads to continued
acquisition of new changes and, in association with chronic wound-healing processes, to
the accumulation of clonal outgrowths throughout the target tissue. Although the
ultimate malignancy may occur in only one or a few tissue sites, manifestations of the
instability process that drives tumorigenesis are globally present in the tissue. Thus
random biopsies may provide useful risk information for the exposed field as a
whole. Even when carcinogen exposure is reduced or chemopreventive strategies are
initiated and histologic manifestations of the tumorigenesis process subside, the
genetic scars of prior exposure remain in the form of clonal outgrowths and may
explain continued lung cancer risk in ex-smokers. Future chemoprevention strategies
need to focus on reducing the degree of chromosome instability and on trying to
eliminate residual abnormal clonal outgrowths in the aerodigestive tract. In this
setting, the measurement of chromosome instability in the target tissue will be useful in
assessing cancer risk as well as response to intervention.



The studies reviewed here represent one component of the collaborative efforts
of the Aerodigestive Tract Chemoprevention team at The University of Texas M.D.
Anderson Cancer Center, Houston, Texas. The studies were supported in part by
National Institutes of Health National Cancer Institute Grants CA 52054, CA 68^^,
CA 79437, CA 16672, CA 68089, CN 25433, CA 86390, CA 70907, NIH DE 13157,
and the State of Texas Tobacco Research Fund.
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